Commits · f6b9664f8b711cf4fd53e70aa0d21f72d5bf806c · openeuler / raspberrypi-kernel

02 Mar, 2011 4 commits

Authored by Herbert Xu on Mar 01, 2011

This patch converts UDP to use the new ip_finish_skb API.  This
would then allow us to more easily use ip_make_skb, which allows
UDP to run without a socket lock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f6b9664f

inet: Add ip_make_skb and ip_finish_skb · 1c32c5ad

Authored by Herbert Xu on Mar 01, 2011

This patch adds the helper ip_make_skb which is like ip_append_data
and ip_push_pending_frames all rolled into one, except that it does
not send the skb produced.  The sending part is carried out by
ip_send_skb, which the transport protocol can call after it has
tweaked the skb.

It is meant to be called in cases where corking is not used, and
should have a one-to-one correspondence to sendmsg.

This patch also adds the helper ip_finish_skb which is meant to
replace ip_push_pending_frames when corking is required.
Previously the protocol stack would peek at the socket write
queue and add its header to the first packet.  With ip_finish_skb,
the protocol stack can directly operate on the final skb instead,
just like the non-corking case with ip_make_skb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c32c5ad

inet: Remove explicit write references to sk/inet in ip_append_data · 1470ddf7

Authored by Herbert Xu on Mar 01, 2011

In order to allow simultaneous calls to ip_append_data on the same
socket, it must not modify any shared state in sk or inet (other
than those that are designed to allow that such as atomic counters).

This patch abstracts out write references to sk and inet_sk in
ip_append_data and its friends so that we may use the underlying
code in parallel.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1470ddf7

inet: Remove unused sk_sndmsg_* from UFO · 5a2ef920

Authored by Herbert Xu on Mar 01, 2011

UFO doesn't really use the sk_sndmsg_* parameters so touching
them is pointless.  It can't use them anyway since the whole
point of UFO is to use the original pages without copying.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5a2ef920

25 Feb, 2011 1 commit

ipv4: Rearrange how ip_route_newports() gets port keys. · dca8b089

Authored by David S. Miller on Feb 24, 2011

ip_route_newports() is the only place in the entire kernel that
cares about the port members in the routing cache entry's lookup
flow key.

Therefore the only reason we store an entire flow inside of the
struct rtentry is for this one special case.

Rewrite ip_route_newports() such that:

1) The caller passes in the original port values, so we don't need
   to use the rth->fl.fl_ip_{s,d}port values to remember them.

2) The lookup flow is constructed by hand instead of being copied
   from the routing cache entry's flow.
Signed-off-by: David S. Miller <davem@davemloft.net>

dca8b089

24 Feb, 2011 2 commits

xfrm: Const'ify address arguments to ->dst_lookup() · 5e6b930f

Authored by David S. Miller on Feb 24, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

5e6b930f

xfrm: Const'ify tmpl and address arguments to ->init_temprop() · 19bd6244

Authored by David S. Miller on Feb 24, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

19bd6244

23 Feb, 2011 3 commits

xfrm: Mark flowi arg to ->init_tempsel() const. · 73e5ebb2

Authored by David S. Miller on Feb 22, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

73e5ebb2

xfrm: Mark flowi arg to ->fill_dst() const. · 0c7b3eef

Authored by David S. Miller on Feb 22, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

0c7b3eef

xfrm: Mark flowi arg to ->get_tos() const. · 05d84025

Authored by David S. Miller on Feb 22, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

05d84025

21 Feb, 2011 1 commit

tcp: Remove debug macro of TCP_CHECK_TIMER · 089c3482

Authored by Shan Wei on Feb 19, 2011

TCP_CHECK_TIMER is no longer used for debugging; it does nothing,
and has been that way for several years, maybe 6 years.

Remove it to keep the code clearer.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

089c3482

20 Feb, 2011 1 commit

tcp: fix inet_twsk_deschedule() · 91035f0b

Authored by Eric Dumazet on Feb 18, 2011

Eric W. Biederman reported a lockdep splat in inet_twsk_deschedule().

This is caused by inet_twsk_purge(), run from process context,
and commit 575f4cd5 (net: Use rcu lookups in inet_twsk_purge.)
removed the BH disabling that was necessary.

Add the BH disabling back, but fine-grained: right before calling
inet_twsk_deschedule(), instead of around the whole function.

With help from Linus Torvalds and Eric W. Biederman.
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Daniel Lezcano <daniel.lezcano@free.fr>
CC: Pavel Emelyanov <xemul@openvz.org>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: stable <stable@kernel.org> (# 2.6.33+)
Signed-off-by: David S. Miller <davem@davemloft.net>

91035f0b

19 Feb, 2011 3 commits

ipv4: Implement __ip_dev_find using new interface address hash. · 9435eb1c

Authored by David S. Miller on Feb 18, 2011

Much quicker than going through the FIB tables.
Signed-off-by: David S. Miller <davem@davemloft.net>

9435eb1c

ipv4: Add hash table of interface addresses. · fd23c3b3

Authored by David S. Miller on Feb 18, 2011

This will be used to optimize __ip_dev_find() and friends.

With help from Eric Dumazet.
Signed-off-by: David S. Miller <davem@davemloft.net>

fd23c3b3

net: provide default_advmss() methods to blackhole dst_ops · 214f45c9

Authored by Eric Dumazet on Feb 18, 2011

Commit 0dbaee3b (net: Abstract default ADVMSS behind an
accessor.) introduced a possible crash in tcp_connect_init(), when
dst->default_advmss() is called from dst_metric_advmss().
Reported-by: George Spelvin <linux@horizon.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

214f45c9

18 Feb, 2011 6 commits

ipv4: Use const'ify fib_result deep in the route call chains. · 982721f3

Authored by David S. Miller on Feb 16, 2011

The only troublesome bit here is __mkroute_output, which wants
to override res->fi and res->type; compute those in local
variables instead.
Signed-off-by: David S. Miller <davem@davemloft.net>

982721f3

ipv4: Avoid use of signed integers in fib_trie code. · 3b004569

Authored by David S. Miller on Feb 16, 2011

GCC emits all kinds of crazy zero extensions when we go from signed
int, to unsigned short, etc. etc.

This transformation has to be legal because:

1) In tkey_extract_bits() and mask_pfx(), the values are used to
   perform shifts, on which negative values are undefined by C.

2) In fib_table_lookup() we perform comparisons with unsigned
   values, constants, and additions, none of which should
   encounter negative values.
Signed-off-by: David S. Miller <davem@davemloft.net>

3b004569
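
The point about shift counts can be illustrated with a small user-space sketch. It is not the fib_trie code: `key_t32`, `KEY_BITS` and `extract_bits` are hypothetical names chosen for this example, and the bounds checks are an assumption about what a careful caller would want.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit key type standing in for a trie key. */
typedef uint32_t key_t32;

#define KEY_BITS 32u

/*
 * Extract `bits` bits of `key`, starting `offset` bits from the most
 * significant end. Every operand is unsigned, so a shift count can
 * never be negative; the checks below additionally keep each count
 * smaller than the type width, which C also requires.
 */
static key_t32 extract_bits(key_t32 key, unsigned int offset, unsigned int bits)
{
	if (bits == 0 || offset >= KEY_BITS)
		return 0;
	assert(bits <= KEY_BITS - offset);
	return (key_t32)(key << offset) >> (KEY_BITS - bits);
}
```

For instance, `extract_bits(0xC0A80101u, 8, 8)` yields `0xA8`, the second octet of 192.168.1.1.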

net: Add initial_ref arg to dst_alloc(). · 3c7bd1a1

Authored by David S. Miller on Feb 16, 2011

This allows avoiding multiple writes to the initial __refcnt.

The simplest cases of wanting an initial reference of "1"
in ipv4 and ipv6 have been converted, the rest have been left
alone and kept at the existing "0".
Signed-off-by: David S. Miller <davem@davemloft.net>

3c7bd1a1

ipv4: Consolidate ipv4 dst allocation logic. · 0c4dcd58

Authored by David S. Miller on Feb 17, 2011

This also allows us to combine all the dst->flags settings and avoid
read/modify/write sequences to this struct member.
Signed-off-by: David S. Miller <davem@davemloft.net>

0c4dcd58

ipv4: Move rcu_read_{lock,unlock}() into ip_route_output_slow(). · 010c2708

Authored by David S. Miller on Feb 17, 2011

Simplifies tail of __ip_route_output_key().
Signed-off-by: David S. Miller <davem@davemloft.net>

010c2708

ipv4: Simplify output route creation call sequence. · 5ada5527

Authored by David S. Miller on Feb 17, 2011

There's a lot of redundancy and unnecessary stack frames
in the output route creation path.

1) Make __mkroute_output() return error pointers.

2) Eliminate ip_mkroute_output() entirely, made possible by #1.

3) Call __mkroute_output() directly and handle the returned error
   pointers in ip_route_output_slow().
Signed-off-by: David S. Miller <davem@davemloft.net>

5ada5527
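
For readers unfamiliar with the "error pointer" idiom mentioned in the steps above, here is a user-space imitation. The helpers `err_ptr`, `is_err` and `ptr_err` are written for this sketch and only mimic the idea of encoding a negative errno in a pointer value; `struct route`, `make_route` and `lookup` are hypothetical stand-ins, not the routing code. The trick relies on implementation-defined pointer/integer conversions and assumes the top 4095 addresses are never valid objects.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer falls in the range reserved for error codes. */
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical route object, for illustration only. */
struct route {
	unsigned int dst;
};

/* Returns either a valid route or an error pointer, never NULL. */
static struct route *make_route(unsigned int dst)
{
	struct route *rt;

	if (dst == 0)
		return err_ptr(-EINVAL);
	rt = malloc(sizeof(*rt));
	if (rt == NULL)
		return err_ptr(-ENOBUFS);
	rt->dst = dst;
	return rt;
}

/* The caller unpacks the error pointer itself; no wrapper is needed. */
static int lookup(unsigned int dst, struct route **out)
{
	struct route *rt = make_route(dst);

	assert(out != NULL);
	if (is_err(rt))
		return (int)ptr_err(rt);
	*out = rt;
	return 0;
}
```

Because a single return value carries both the object and the failure reason, an intermediate function whose only job is to translate between the two conventions becomes unnecessary.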

15 Feb, 2011 4 commits

ipv4: Cache learned redirect information in inetpeer. · f39925db

Authored by David S. Miller on Feb 09, 2011

Note that we do not generate the redirect netevent any longer,
because we don't create a new cached route.

Instead, once the new neighbour is bound to the cached route,
we emit a neigh update event.
Signed-off-by: David S. Miller <davem@davemloft.net>

f39925db

ipv4: Cache learned PMTU information in inetpeer. · 2c8cec5c

Authored by David S. Miller on Feb 09, 2011

The general idea is that if we learn new PMTU information, we
bump the peer genid.

This triggers the dst_ops->check() code to validate and if
necessary propagate the new PMTU value into the metrics.

Learned PMTU information self-expires.

This means that it is not necessary to kill a cached route
entry just because the PMTU information is too old.

As a consequence:

1) When the path appears unreachable (dst_ops->link_failure
   or dst_ops->negative_advice) we unwind the PMTU state if
   it is out of date, instead of killing the cached route.

   A redirected route will still be invalidated in these
   situations.

2) rt_check_expire(), rt_worker_func(), et al. are no longer
   necessary at all.
Signed-off-by: David S. Miller <davem@davemloft.net>

2c8cec5c

arp_notify: unconditionally send gratuitous ARP for NETDEV_NOTIFY_PEERS. · d11327ad

Authored by Ian Campbell on Feb 11, 2011

NETDEV_NOTIFY_PEERS is an explicit request by the driver to send a link
notification while NETDEV_UP/NETDEV_CHANGEADDR generate link
notifications as a sort of side effect.

In the latter cases the sysctl option is present because link
notification events can have undesired effects e.g. if the link is
flapping. I don't think this applies in the case of an explicit
request from a driver.

This patch makes NETDEV_NOTIFY_PEERS unconditional; if preferred, we
could add a new sysctl for this case which defaults to on.

This change causes Xen post-migration ARP notifications (which cause
switches to relearn their MAC tables etc) to be sent by default.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d11327ad

ipv4: fix rcu lock imbalance in fib_select_default() · 31d40937

Authored by Eric Dumazet on Feb 14, 2011

Commit 0c838ff1 (ipv4: Consolidate all default route selection
implementations.) forgot to remove one rcu_read_unlock() from
fib_select_default().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31d40937

12 Feb, 2011 1 commit

ip_gre: Add IPPROTO_GRE to flowi in ipgre_tunnel_xmit · 946bf5ee

Authored by Steffen Klassert on Feb 11, 2011

Commit 5811662b ("net: use the macros defined for the members of
flowi") accidentally removed the setting of IPPROTO_GRE from the
struct flowi in ipgre_tunnel_xmit. This patch restores it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

946bf5ee

11 Feb, 2011 3 commits

inet: Create a mechanism for upward inetpeer propagation into routes. · 6431cbc2

Authored by David S. Miller on Feb 07, 2011

If we didn't have a routing cache, we would not be able to properly
propagate certain kinds of dynamic path attributes, for example
PMTU information and redirects.

The reason is that if we didn't have a routing cache, then there would
be no way to look up all of the active cached routes hanging off of
sockets, tunnels, IPSEC bundles, etc.

Consider the case where we created a cached route, but no inetpeer
entry existed and also we were not asked to pre-COW the route metrics
and therefore did not force the creation of a new inetpeer entry.

If we later get a PMTU message, or a redirect, and store this
information in a new inetpeer entry, there is no way to teach that
cached route about the newly existing inetpeer entry.

The facilities implemented here handle this problem.

First we create a generation ID.  When we create a cached route of any
kind, we remember the generation ID at the time of attachment.  Any
time we force-create an inetpeer entry in response to new path
information, we bump that generation ID.

The dst_ops->check() callback is where the knowledge of this event
is propagated.  If the global generation ID does not equal the one
stored in the cached route, and the cached route has not attached
to an inetpeer yet, we look it up and attach if one is found.  Now
that we've updated the cached route's information, we update the
route's generation ID too.

This clears the way for implementing PMTU and redirects directly in
the inetpeer cache.  There is absolutely no need to consult cached
route information in order to maintain this information.

At this point nothing bumps the inetpeer genids; that comes in
later changes which handle PMTUs and redirects using inetpeers.
Signed-off-by: David S. Miller <davem@davemloft.net>

6431cbc2
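
The generation-ID scheme described in this commit message can be sketched in a few lines of user-space C. Everything here is a simplification made up for illustration: `struct peer`, `struct cached_route`, the single-slot `peer_table`, and the function names are hypothetical, and the real lookup is keyed by address rather than being one global slot.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an inetpeer entry. */
struct peer {
	unsigned int pmtu;
};

/* Stand-in for a cached route. */
struct cached_route {
	struct peer *peer;        /* NULL until a peer is attached */
	unsigned int peer_genid;  /* generation seen at last check */
};

/* Global generation ID, bumped when a peer entry is force-created. */
static unsigned int peer_genid;

/* Hypothetical one-entry "peer cache". */
static struct peer *peer_table;

/* Creating a route: attach a peer if one exists, remember the genid. */
static void route_attach(struct cached_route *rt)
{
	rt->peer = peer_table;
	rt->peer_genid = peer_genid;
}

/* New path information arrived: create the peer and bump the genid. */
static void peer_force_create(struct peer *p)
{
	peer_table = p;
	peer_genid++;
}

/* The check step: pick up a peer created after the route was cached. */
static void route_check(struct cached_route *rt)
{
	assert(rt != NULL);
	if (rt->peer_genid == peer_genid)
		return;                   /* nothing changed since last look */
	if (rt->peer == NULL)
		rt->peer = peer_table;    /* attach if one is found */
	rt->peer_genid = peer_genid;      /* route is up to date again */
}
```

The common case costs one integer comparison per check; the lookup only happens after some peer entry has been force-created since the route last looked.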

inetpeer: Add redirect and PMTU discovery cached info. · ddd4aa42

Authored by David S. Miller on Feb 09, 2011

Validity of the cached PMTU information is indicated by its
expiration value being non-zero, just as per dst->expires.

The scheme we will use is that we will remember the pre-ICMP value
held in the metrics or route entry, and then at expiration time
we will restore that value.

In this way PMTU expiration does not kill off the cached route as is
done currently.

Redirect information is permanent, or at least until another redirect
is received.
Signed-off-by: David S. Miller <davem@davemloft.net>

ddd4aa42
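
The remember-then-restore scheme in this commit message might look roughly like the following user-space sketch. The struct layout, the function names, and the plain tick counter (with no wraparound handling) are assumptions for illustration, not the inetpeer implementation.

```c
#include <assert.h>

/* Hypothetical per-destination PMTU state. */
struct peer_pmtu {
	unsigned long pmtu_expires;  /* 0 means "no valid learned PMTU" */
	unsigned int pmtu_orig;      /* value held before the ICMP arrived */
	unsigned int pmtu_learned;   /* value learned from the ICMP */
};

/* Record a newly learned PMTU and apply it to the route's MTU. */
static void pmtu_learn(struct peer_pmtu *p, unsigned int *route_mtu,
		       unsigned int new_mtu, unsigned long now,
		       unsigned long lifetime)
{
	assert(lifetime > 0);
	if (p->pmtu_expires == 0)
		p->pmtu_orig = *route_mtu;   /* remember the pre-ICMP value once */
	p->pmtu_learned = new_mtu;
	p->pmtu_expires = now + lifetime;
	if (p->pmtu_expires == 0)
		p->pmtu_expires = 1;         /* keep 0 reserved for "invalid" */
	*route_mtu = new_mtu;
}

/* At expiration, restore the old value instead of dropping the route. */
static void pmtu_check_expiry(struct peer_pmtu *p, unsigned int *route_mtu,
			      unsigned long now)
{
	if (p->pmtu_expires != 0 && now >= p->pmtu_expires) {
		*route_mtu = p->pmtu_orig;
		p->pmtu_expires = 0;
	}
}
```

With a route MTU of 1500, learning 1400 at tick 100 with a lifetime of 600 keeps 1400 in effect through tick 699 and restores 1500 at tick 700.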

inetpeer: Abstract address representation further. · 7a71ed89

Authored by David S. Miller on Feb 09, 2011

Future changes will add caching information, and some of
these new elements will be addresses.

Since the family is implicit via the ->daddr.family member,
replicating the family in every address we store is entirely
redundant.
Signed-off-by: David S. Miller <davem@davemloft.net>

7a71ed89

09 Feb, 2011 2 commits

net: Kill NETEVENT_PMTU_UPDATE. · 8d13a2a9

Authored by David S. Miller on Feb 08, 2011

Nobody actually does anything in response to the event,
so just kill it off.
Signed-off-by: David S. Miller <davem@davemloft.net>

8d13a2a9

ipsec: allow to align IPv4 AH on 32 bits · fa9921e4

Authored by Nicolas Dichtel on Feb 02, 2011

The Linux IPv4 AH stack aligns the AH header on a 64 bit boundary
(like in IPv6). This is not RFC compliant (see RFC4302, Section
3.3.3.2.1), it should be aligned on 32 bits.

For most of the authentication algorithms, the ICV size is 96 bits.
The AH header alignment on 32 or 64 bits gives the same results.

However for SHA-256-128 for instance, the wrong 64 bit alignment results
in adding useless padding in IPv4 AH, which is forbidden by the RFC.

To avoid breaking backward compatibility, we use a new flag
(XFRM_STATE_ALIGN4) to change the original behavior.

Initial patch from Dang Hongwu <hongwu.dang@6wind.com> and
Christophe Gouault <christophe.gouault@6wind.com>.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa9921e4
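
The arithmetic behind this commit message can be checked with a short sketch. It assumes the 12-byte fixed part of the AH header from RFC 4302 (next header, payload length, reserved, SPI, sequence number) followed by the ICV; `align_up` and `ah_len` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed AH fields before the ICV: 1 + 1 + 2 + 4 + 4 bytes. */
#define AH_FIXED_LEN 12u

/* Round len up to the next multiple of align (a power of two). */
static size_t align_up(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}

/* Total AH header length in bytes for a given ICV size and alignment. */
static size_t ah_len(size_t icv_len, size_t align)
{
	return align_up(AH_FIXED_LEN + icv_len, align);
}
```

With a 96-bit (12-byte) ICV, `ah_len(12, 4)` and `ah_len(12, 8)` both give 24 bytes, so the alignment choice is invisible. With a 128-bit (16-byte) ICV, `ah_len(16, 4)` gives 28 bytes while `ah_len(16, 8)` gives 32, which is the 4 bytes of extra padding the message refers to.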

05 Feb, 2011 2 commits

inetpeer: Move ICMP rate limiting state into inet_peer entries. · 92d86829

Authored by David S. Miller on Feb 04, 2011

Like metrics, the ICMP rate limiting bits are cached state about
a destination.  So move it into the inet_peer entries.

If an inet_peer cannot be bound (the reason is memory allocation
failure or similar), the policy is to allow.
Signed-off-by: David S. Miller <davem@davemloft.net>

92d86829
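
The "allow when no peer can be bound" policy can be illustrated with a token-bucket sketch. The commit message does not spell out the limiting algorithm, so the bucket, the `BURST_FACTOR` constant, and the names `struct peer_rl` and `rl_allow` are all assumptions made for this example; only the NULL-peer behaviour is taken from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed cap on accumulated tokens, as a multiple of the timeout. */
#define BURST_FACTOR 6ul

/* Hypothetical per-destination rate limiting state. */
struct peer_rl {
	unsigned long tokens;  /* accumulated ticks not yet spent */
	unsigned long last;    /* tick of the previous check */
};

/*
 * Decide whether one message may be sent at tick `now`, given that
 * each message costs `timeout` ticks worth of tokens.
 */
static bool rl_allow(struct peer_rl *peer, unsigned long now,
		     unsigned long timeout)
{
	unsigned long tokens;
	bool ok = false;

	if (peer == NULL)
		return true;   /* no state could be bound: policy is to allow */

	assert(now >= peer->last);
	tokens = peer->tokens + (now - peer->last);
	peer->last = now;
	if (tokens > BURST_FACTOR * timeout)
		tokens = BURST_FACTOR * timeout;
	if (tokens >= timeout) {
		tokens -= timeout;
		ok = true;
	}
	peer->tokens = tokens;
	return ok;
}
```

Failing open in the NULL case means a memory allocation failure degrades rate limiting rather than suppressing ICMP messages altogether.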

ipv4: Don't miss existing cached metrics in new routes. · 0131ba45

Authored by David S. Miller on Feb 04, 2011

Always look up whether we have an existing inetpeer entry for
a route.  Let FLOWI_FLAG_PRECOW_METRICS merely influence the
"create" argument to rt_bind_peer().

Also, call rt_bind_peer() unconditionally since it is not
possible for rt->peer to be non-NULL at this point.
Signed-off-by: David S. Miller <davem@davemloft.net>

0131ba45

04 Feb, 2011 2 commits

net: Support compat SIOCGETVIFCNT ioctl in ipv4. · ca6b8bb0

Authored by David S. Miller on Feb 03, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

ca6b8bb0

net: Fix bug in compat SIOCGETSGCNT handling. · 0033d5ad

Authored by David S. Miller on Feb 03, 2011

Commit 709b46e8 ("net: Add compat ioctl support for the ipv4
multicast ioctl SIOCGETSGCNT") added the correct plumbing to handle
SIOCGETSGCNT properly.

However, whilst defining a proper "struct compat_sioc_sg_req" it
isn't actually used in ipmr_compat_ioctl().

Correct this oversight.
Signed-off-by: David S. Miller <davem@davemloft.net>

0033d5ad

03 Feb, 2011 2 commits

ipv4: Fix fib_trie build in some configurations. · b299e4f0

Authored by David S. Miller on Feb 02, 2011

If we end up including include/linux/node.h (either explicitly
or implicitly) that header has a definition of "struct node"
too.

So rename the one we use in fib_trie to "rt_trie_node" to avoid
the conflict.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

b299e4f0

tcp: Increase the initial congestion window to 10. · 442b9635

Authored by David S. Miller on Feb 02, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Nandita Dukkipati <nanditad@google.com>

442b9635

02 Feb, 2011 3 commits

ipv4: Rename fib_hash_* locals in fib_semantics.c · 123b9731

Authored by David S. Miller on Feb 01, 2011

To avoid confusion with the recently deleted fib_hash.c
code, use "fib_info_hash_*" instead of plain "fib_hash_*".
Signed-off-by: David S. Miller <davem@davemloft.net>

123b9731

ipv4: Update some fib_hash centric interface names. · 5348ba85

Authored by David S. Miller on Feb 01, 2011

fib_hash_init() --> fib_trie_init()
fib_hash_table() --> fib_trie_table()
Signed-off-by: David S. Miller <davem@davemloft.net>

5348ba85

ipv4: Remove fib_hash. · 3630b7c0

Authored by David S. Miller on Feb 01, 2011

The time has finally come to remove the hash based routing table
implementation in ipv4.

FIB Trie is mature, well tested, and I've done an audit of its code
to confirm that it implements insert, delete, and lookup with
semantics identical to those of fib_hash.

If there are any semantic differences found in fib_trie, we should
simply fix them.

I've placed the trie statistic config option under advanced router
configuration.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>

3630b7c0