Commits · 60d354ebebd9d0f760cb6c3b9f53a7ade0f8cd0e · openeuler / raspberrypi-kernel

05 Jul, 2012 1 commit

ipv4: Make neigh lookups directly in output packet path. · a263b309

Committed by David S. Miller on Jul 02, 2012

Do not use the dst cached neigh; we'll be getting rid of that.
Signed-off-by: David S. Miller <davem@davemloft.net>

a263b309

29 Jun, 2012 5 commits

ipv4: Elide fib_validate_source() completely when possible. · 7a9bc9b8

Committed by David S. Miller on Jun 29, 2012

If rpfilter is off (or the SKB has an IPSEC path) and there are not
tclassid users, we don't have to do anything at all when
fib_validate_source() is invoked besides setting the itag to zero.

We monitor tclassid uses with a counter (modified only under RTNL and
marked __read_mostly) and we protect the fib_validate_source() real
work with a test against this counter and whether rpfilter is to be
done.

Having a way to know whether tclassid processing is needed at all
also opens the door for future optimized rpfilter algorithms that do
not perform full FIB lookups.
Signed-off-by: David S. Miller <davem@davemloft.net>

7a9bc9b8
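
The fast path described in the commit above can be sketched roughly as follows. This is a simplified userspace model, not the kernel code: `tclassid_users`, `struct pkt_ctx`, and `validate_source_slow()` are hypothetical stand-ins chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter of tclassid users. In the commit, the real
 * counter is modified only under RTNL and marked __read_mostly. */
static int tclassid_users;

/* Hypothetical per-packet context. */
struct pkt_ctx {
    bool rpfilter_enabled; /* reverse-path filtering is configured */
    bool has_ipsec_path;   /* the packet carries an IPsec path */
};

/* Stand-in for the expensive validation (a full FIB lookup). */
static int validate_source_slow(const struct pkt_ctx *ctx, uint32_t *itag)
{
    (void)ctx;
    *itag = 0x1234; /* placeholder result, for illustration only */
    return 1;
}

/* Skip the real work when neither rpfilter nor tclassid needs it. */
static int validate_source(const struct pkt_ctx *ctx, uint32_t *itag)
{
    bool need_rpf = ctx->rpfilter_enabled && !ctx->has_ipsec_path;

    if (!need_rpf && tclassid_users == 0) {
        *itag = 0;
        return 0;
    }
    return validate_source_slow(ctx, itag);
}
```

The point of the counter is that the common case costs two cheap tests instead of a lookup.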

ipv6_tunnel: Allow receiving packets on the fallback tunnel if they pass sanity checks · d0087b29

Committed by Ville Nuorvala on Jun 28, 2012

At Facebook, we do Layer-3 DSR via IP-in-IP tunneling. Our load balancers wrap
an extra IP header on incoming packets so they can be routed to the backend.
In the v4 tunnel driver, when these packets fall on the default tunl0 device,
the behavior is to decapsulate them and drop them back on the stack. So our
setup is that tunl0 has the VIP and eth0 has (obviously) the backend's real
address.

In IPv6 we do the same thing, but the v6 tunnel driver didn't have this same
behavior - if you didn't have an explicit tunnel setup, it would drop the
packet.

This patch brings that v4 feature to the v6 driver.

The same IPv6 address checks are performed as with any normal tunnel,
but as the fallback tunnel endpoint addresses are unspecified, the checks
must be performed on a per-packet basis, rather than at tunnel
configuration time.

[Patch description modified by phil@ipom.com]
Signed-off-by: Ville Nuorvala <ville.nuorvala@gmail.com>
Tested-by: Phil Dibowitz <phil@ipom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d0087b29

ipv4: Adjust in_dev handling in fib_validate_source() · 9e56e380

Committed by David S. Miller on Jun 28, 2012

Checking for in_dev being NULL is pointless.

In fact, all of our callers have in_dev precomputed already,
so just pass it in and remove the NULL checking.
Signed-off-by: David S. Miller <davem@davemloft.net>

9e56e380

net: Use NLMSG_DEFAULT_SIZE in combination with nlmsg_new() · 58050fce

Committed by Thomas Graf on Jun 28, 2012

Using NLMSG_GOODSIZE results in multiple pages being used, as
nlmsg_new() will automatically add the size of the netlink
header to the payload, thus exceeding the page limit.

NLMSG_DEFAULT_SIZE takes this into account.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Sergey Lapin <slapin@ossfans.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

58050fce
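
The size arithmetic behind the NLMSG_DEFAULT_SIZE commit above can be illustrated with a small model. The page size and skb overhead below are assumed values for illustration, and the function names are hypothetical; only the relationship between the sizes matters (the allocator adds the netlink header on top of the requested payload).

```c
/* Assumed values, for illustration only. */
#define PAGE_SIZE_ASSUMED    4096u /* a common page size */
#define SKB_OVERHEAD_ASSUMED  320u /* hypothetical skb bookkeeping overhead */
#define NLMSG_HDRLEN_MODEL     16u /* aligned size of the netlink header */

#define ALIGN4(x) (((x) + 3u) & ~3u)

/* Models NLMSG_GOODSIZE: the most that fits in one page. */
static unsigned int goodsize(void)
{
    return PAGE_SIZE_ASSUMED - SKB_OVERHEAD_ASSUMED;
}

/* Models NLMSG_DEFAULT_SIZE: the same budget minus the netlink header. */
static unsigned int default_size(void)
{
    return goodsize() - NLMSG_HDRLEN_MODEL;
}

/* What an nlmsg_new()-style allocator requests: header plus payload. */
static unsigned int alloc_size(unsigned int payload)
{
    return ALIGN4(NLMSG_HDRLEN_MODEL + payload);
}
```

In this model, `alloc_size(goodsize())` overshoots the one-page budget by the header length, while `alloc_size(default_size())` lands exactly on it.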

tcp: pass fl6 to inet6_csk_route_req() · 3840a06e

Committed by Neal Cardwell on Jun 28, 2012

This commit changes inet6_csk_route_req() so that it uses a pointer to
a struct flowi6, rather than allocating its own on the stack. This
brings its behavior in line with its IPv4 cousin,
inet_csk_route_req(), and allows a follow-on patch to fix a dst leak.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3840a06e

28 Jun, 2012 7 commits

ipv4: Kill rt->rt_spec_dst, no longer used. · 41347dcd

Committed by David S. Miller on Jun 28, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

41347dcd

ipv4: Create and use fib_compute_spec_dst() helper. · 35ebf65e

Committed by David S. Miller on Jun 28, 2012

The specific destination is the host we direct unicast replies to.
Usually this is the original packet source address, but if we are
responding to a multicast or broadcast packet we have to use something
different.

Specifically we must use the source address we would use if we were to
send a packet to the unicast source of the original packet.

The routing cache precomputes this value, but we want to remove that
precomputation because it creates a hard dependency on the expensive
rpfilter source address validation which we'd like to make cheaper.

There are only three places where this matters:

1) ICMP replies.

2) pktinfo CMSG

3) IP options

Now there will be no real users of rt->rt_spec_dst and we can simply
remove it altogether.
Signed-off-by: David S. Miller <davem@davemloft.net>

35ebf65e

ipv4: Show that ip_send_reply() is purely unicast routine. · 70e73416

Committed by David S. Miller on Jun 28, 2012

Rename it to ip_send_unicast_reply() and add explicit 'saddr'
argument.

This removes one of the few users of rt->rt_spec_dst.
Signed-off-by: David S. Miller <davem@davemloft.net>

70e73416

ipv4: Kill early demux method return value. · 160eb5a6

Committed by David S. Miller on Jun 27, 2012

It's completely unnecessary.
Signed-off-by: David S. Miller <davem@davemloft.net>

160eb5a6

xfrm_user: Propagate netlink error codes properly. · 1d1e34dd

Committed by David S. Miller on Jun 27, 2012

Instead of using a fixed value of "-1" or "-EMSGSIZE", propagate what
the nla_*() interfaces actually return.
Signed-off-by: David S. Miller <davem@davemloft.net>

1d1e34dd
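
The error-propagation pattern from the xfrm_user commit above looks roughly like this. It is a sketch under stated assumptions: `put_attr()` is a hypothetical stand-in for an nla_*()-style helper, and the two fill functions are invented for contrast.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an nla_*()-style helper: returns 0 on
 * success, or a negative errno when the attribute does not fit. */
static int put_attr(size_t *room, size_t len)
{
    if (len > *room)
        return -EMSGSIZE;
    *room -= len;
    return 0;
}

/* Before: every failure is collapsed into one fixed value. */
static int fill_msg_fixed(size_t room)
{
    if (put_attr(&room, 8) || put_attr(&room, 16))
        return -1;
    return 0;
}

/* After: the helper's own return value is handed back to the caller. */
static int fill_msg_propagate(size_t room)
{
    int err;

    err = put_attr(&room, 8);
    if (err)
        return err;
    err = put_attr(&room, 16);
    if (err)
        return err;
    return 0;
}
```

With propagation, a caller can tell a full buffer apart from any other failure the helper may report later.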

Revert "ipv4: tcp: dont cache unconfirmed intput dst" · c10237e0

Committed by David S. Miller on Jun 27, 2012

This reverts commit c074da28.

This change has several unwanted side effects:

1) Sockets will cache the DST_NOCACHE route in sk->sk_rx_dst and we'll
   thus never create a real cached route.

2) All TCP traffic will use DST_NOCACHE and never use the routing
   cache at all.
Signed-off-by: David S. Miller <davem@davemloft.net>

c10237e0

ipv4: tcp: dont cache unconfirmed intput dst · c074da28

Committed by Eric Dumazet on Jun 26, 2012

DDoS SYN flood attacks hit the IP route cache badly.

On typical machines, this cache is allowed to hold up to 8 million dst
entries, 256 bytes each, for a total of 2GB of memory.

rt_garbage_collect() triggers and tries to clean things up.

Eventually the route cache is disabled, but the machine is under fire
and might OOM and crash.

This patch exploits the new TCP early demux to set a nocache
boolean in case the incoming TCP frame is for a not yet ESTABLISHED or
TIMEWAIT socket.

This 'nocache' boolean is then used in case the dst entry is not found
in the route cache, to create an unhashed dst entry (DST_NOCACHE).

SYN-cookie ACKs use a similar mechanism (ipv4: tcp: dont cache
output dst for syncookies), so after this patch, a machine is able to
absorb a DDoS SYN flood attack without polluting its IP route cache.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c074da28

27 Jun, 2012 1 commit

mac802154: add wpan device-class support · 32bad7e3

Committed by alex.bluesman.smirnov@gmail.com on Jun 25, 2012

Every real 802.15.4 transceiver that works with a software MAC layer
can be classified as a wpan device in this stack. So the wpan device
implementation provides the missing link in the datapath between the
device drivers and the Linux network queue.

According to the IEEE 802.15.4 standard each packet can be one of the
following types:
 - beacon
 - MAC layer command
 - ACK
 - data

This patch adds support for the data packet-type only, but this is
enough to perform data transmission and reception over radio.
Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

32bad7e3

26 Jun, 2012 6 commits

nl80211: specify RSSI threshold in scheduled scan · 88e920b4

Committed by Thomas Pedersen on Jun 21, 2012

Support configuring an RSSI threshold in dBm (s32) when requesting
a scheduled scan, below which a BSS won't be reported by the cfg80211
driver.
Signed-off-by: Thomas Pedersen <c_tpeder@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

88e920b4

caif-hsi: Remove use of module parameters · 91fa0cbc

Committed by Sjur Brændeland on Jun 25, 2012

Remove use of module parameters on caif hsi device, as
rtnl configuration parameters are already supported.

All caif hsi configuration data is put in cfhsi_config,
and default values in hsi_default_config.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

91fa0cbc

caif-hsi: Replace platform device with ops structure. · 1c385f1f

Committed by Sjur Brændeland on Jun 25, 2012

Remove use of struct platform_device, and replace it with
struct cfhsi_ops. Updated variable names in the same
spirit:
cfhsi_get_dev to cfhsi_get_ops,
cfhsi->dev to cfhsi->ops, and
cfhsi->dev.drv to cfhsi->ops->cb_ops.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c385f1f

caif-hsi: Add rtnl support · c4125400

Committed by Sjur Brændeland on Jun 25, 2012

Add RTNL support for managing the caif hsi interface.
The HSI HW interface no longer registers as a device;
instead we use symbol_get to get hold of the HSI API.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c4125400

net: struct sock cleanups · deaa5854

Committed by Eric Dumazet on Jun 24, 2012

Add missing kernel doc for sk_rx_dst.

Move sk_rx_dst to avoid two 32bit holes on 64bit arches.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

deaa5854
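
The kind of padding hole the struct sock cleanup above avoids can be shown with a generic example. The structs below are hypothetical and are not the actual struct sock layout; they only demonstrate how member order changes padding on a typical 64-bit ABI.

```c
#include <stddef.h>

/* A 32-bit member placed before each pointer: on a typical 64-bit ABI
 * the compiler inserts 4 bytes of padding after each int so that the
 * following pointer is 8-byte aligned. */
struct with_holes {
    int   a;
    void *p;
    int   b;
    void *q;
};

/* Same members, with the two 32-bit fields adjacent so that they
 * share a single 8-byte slot and leave no hole. */
struct reordered {
    int   a;
    int   b;
    void *p;
    void *q;
};
```

On common LP64 targets the first layout is typically 32 bytes and the second 24; on 32-bit targets the two are usually the same size.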

net: Remove 'unlikely' qualifier in skb_steal_sock() · efc27f8c

Committed by Vijay Subramanian on Jun 24, 2012

With early demux enabled by default for TCP flows, there is a high chance that
skb->sk will be non-null. 'unlikely()' was removed from __inet_lookup_skb() but
maybe it can be removed from skb_steal_sock() as well.

Note: skb_steal_sock() is also called by __inet6_lookup_skb() and
__udp4_lib_lookup_skb() but they are protected by their own 'unlikely' calls.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

efc27f8c
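
For readers unfamiliar with the branch hint discussed in the skb_steal_sock() commit above, here is a minimal userspace sketch. It assumes a GCC or Clang compiler for `__builtin_expect`; `struct sock_model` and `struct skb_model` are hypothetical simplifications, not the kernel types.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's branch hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

struct sock_model { int id; };               /* hypothetical */
struct skb_model  { struct sock_model *sk; }; /* hypothetical */

/* Before: the compiler is told that an attached socket is rare. */
static struct sock_model *steal_sock_hinted(struct skb_model *skb)
{
    if (unlikely(skb->sk)) {
        struct sock_model *sk = skb->sk;

        skb->sk = NULL;
        return sk;
    }
    return NULL;
}

/* After: no hint, since early demux makes a non-NULL sk common. */
static struct sock_model *steal_sock(struct skb_model *skb)
{
    if (skb->sk) {
        struct sock_model *sk = skb->sk;

        skb->sk = NULL;
        return sk;
    }
    return NULL;
}
```

Both versions behave identically; the hint only influences how the compiler lays out the branches.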

24 Jun, 2012 1 commit

mac80211: clean up debugging · bdcbd8e0

Committed by Johannes Berg on Jun 22, 2012

There are a few things that make the logging and
debugging in mac80211 less useful than it should
be right now:
 * a lot of messages should be pr_info, not pr_debug
 * wholesale use of pr_debug makes it require *both*
   Kconfig and dynamic configuration
 * there are still a lot of ifdefs
 * the style is very inconsistent, sometimes the
   sdata->name is printed in front

Clean up everything, introducing new macros and
separating out the station MLME debugging into
a new Kconfig symbol.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

bdcbd8e0

23 Jun, 2012 2 commits

ipv4: tcp: dont cache output dst for syncookies · 7586eceb

Committed by Eric Dumazet on Jun 20, 2012

Don't cache output dst for syncookies, as this adds pressure on the IP
route cache and the RCU subsystem for no gain.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7586eceb

ipv4: Add sysctl knob to control early socket demux · 6648bd7e

Committed by Alexander Duyck on Jun 21, 2012

This change is meant to add a control for disabling early socket demux.
The main motivation behind this patch is to provide an option to disable
the feature as it adds an additional cost to routing that reduces overall
throughput by up to 5%.  For example, one of my systems went from 12.1Mpps
to 11.6Mpps after the early socket demux was added.  It looks like the reason
for the regression is that we are now having to perform two lookups, first
the one for an established socket, and then the one for the routing table.

By adding this patch and toggling the value for ip_early_demux to 0 I am
able to get back to the 12.1Mpps I was previously seeing.

[ Move local variables in ip_rcv_finish() down into the basic
  block in which they are actually used.  -DaveM ]
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6648bd7e
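
The two-lookup cost described in the sysctl commit above can be modelled as below (the quoted figures, 12.1Mpps versus 11.6Mpps, amount to a drop of roughly 4%). This is a toy model of the routing stage only: `ip_early_demux_model` and `lookups_for_packet()` are hypothetical names, and no real socket or route tables are involved.

```c
#include <stdbool.h>

/* Hypothetical model of the sysctl knob; non-zero enables early demux. */
static int ip_early_demux_model = 1;

/* Counts the lookups performed at the routing stage for one packet.
 * has_established_sock: an established local socket matches the packet
 * and holds a cached input route. */
static int lookups_for_packet(bool has_established_sock)
{
    int lookups = 0;

    if (ip_early_demux_model) {
        lookups++;                  /* established-socket lookup */
        if (has_established_sock)
            return lookups;         /* cached input route is reused */
    }
    lookups++;                      /* routing table lookup */
    return lookups;
}
```

For traffic that never matches a local established socket, such as routed packets, the model performs two lookups with the knob on and one with it off, which is the regression the commit message describes.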

21 Jun, 2012 1 commit

mac80211: add command to get current rssi · 66572cfc

Committed by Victor Goldenshtein on Jun 21, 2012

Get current rssi (in dBm) from the driver/FW.

Instead of reporting the signal received in the last
rx packet, which might be inaccurate if rx traffic is
low and beacon filtering is enabled, get the signal
from the driver/FW.
Signed-off-by: Victor Goldenshtein <victorg@ti.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

66572cfc

20 Jun, 2012 2 commits

ipv4: Early TCP socket demux. · 41063e9d

Committed by David S. Miller on Jun 19, 2012

Input packet processing for local sockets involves two major demuxes.
One for the route and one for the socket.

But we can optimize this down to one demux for certain kinds of local
sockets.

Currently we only do this for established TCP sockets, but it could
at least in theory be expanded to other kinds of connections.

If a TCP socket is established then its identity is fully specified.

This means that whatever input route was used during the three-way
handshake must work equally well for the rest of the connection since
the keys will not change.

Once we move to established state, we cache the receive packet's input
route to use later.

Like the existing cached route in sk->sk_dst_cache used for output
packets, we have to check for route invalidations using dst->obsolete
and dst->ops->check().

Early demux occurs outside of a socket locked section, so when a route
invalidation occurs we defer the fixup of sk->sk_rx_dst until we are
actually inside of established state packet processing and thus have
the socket locked.
Signed-off-by: David S. Miller <davem@davemloft.net>

41063e9d
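
The invalidation check mentioned in the early demux commit above (dst->obsolete plus dst->ops->check()) can be sketched like this. The types are hypothetical simplifications; in particular the check callback here takes no extra arguments, which is an assumption made to keep the example short.

```c
#include <stddef.h>

struct dst_model;

/* Hypothetical, simplified counterpart of dst->ops. */
struct dst_ops_model {
    /* Returns the dst if it is still valid, NULL if it must be dropped. */
    struct dst_model *(*check)(struct dst_model *dst);
};

struct dst_model {
    int obsolete;                    /* non-zero: needs revalidation */
    const struct dst_ops_model *ops;
};

/* Returns the cached route if usable, NULL if a fresh lookup is needed. */
static struct dst_model *rx_dst_check(struct dst_model *dst)
{
    if (dst == NULL)
        return NULL;
    if (dst->obsolete && dst->ops->check(dst) == NULL)
        return NULL;
    return dst;
}

/* Two trivial check callbacks, for demonstration. */
static struct dst_model *always_stale(struct dst_model *dst)
{
    (void)dst;
    return NULL;
}

static struct dst_model *always_valid(struct dst_model *dst)
{
    return dst;
}
```

A route that is not marked obsolete is used without calling the callback at all, so the common case stays cheap.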

inet: Sanitize inet{,6} protocol demux. · f9242b6b

Committed by David S. Miller on Jun 19, 2012

Don't pretend that inet_protos[] and inet6_protos[] are hashes, they
are just straight arrays.  Remove all unnecessary hash masking.

Document MAX_INET_PROTOS.

Use RAW_HTABLE_SIZE when appropriate.
Reported-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f9242b6b
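
The "straight array" idea from the protocol demux commit above amounts to indexing by the 8-bit protocol number directly. The sketch below uses hypothetical names (`protos`, `proto_register()`, `proto_lookup()`) and is not the kernel's registration API.

```c
#include <stddef.h>
#include <stdint.h>

/* One slot per possible 8-bit protocol number. */
#define MAX_PROTOS_MODEL 256

struct proto_handler { const char *name; }; /* hypothetical */

static const struct proto_handler *protos[MAX_PROTOS_MODEL];

/* Claims the slot for a protocol number; fails if it is already taken. */
static int proto_register(const struct proto_handler *h, uint8_t protocol)
{
    if (protos[protocol] != NULL)
        return -1;
    protos[protocol] = h;
    return 0;
}

/* Direct index: no "hash & (size - 1)" masking is needed, because a
 * uint8_t can never exceed the array bounds. */
static const struct proto_handler *proto_lookup(uint8_t protocol)
{
    return protos[protocol];
}
```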

19 Jun, 2012 2 commits

netfilter: fix missing symbols if CONFIG_NETFILTER_NETLINK_QUEUE_CT unset · 674147e2

Committed by Pablo Neira Ayuso on Jun 19, 2012

ERROR: "nfqnl_ct_parse" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_seq_adjust" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_put" [net/netfilter/nfnetlink_queue.ko] undefined!
ERROR: "nfqnl_ct_get" [net/netfilter/nfnetlink_queue.ko] undefined!

We have to use CONFIG_NETFILTER_NETLINK_QUEUE_CT in
include/net/netfilter/nfnetlink_queue.h, not CONFIG_NF_CONNTRACK.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

674147e2

netfilter: nfnetlink_queue: fix compilation with NF_CONNTRACK disabled · 7c622345

Committed by Pablo Neira Ayuso on Jun 19, 2012

In "9cb01766 netfilter: add glue code to integrate nfnetlink_queue and ctnetlink"
the compilation with NF_CONNTRACK disabled is broken. This patch fixes this
issue.

I have moved the conntrack part into nfnetlink_queue_ct.c to avoid
peppering the entire nfnetlink_queue.c code with ifdefs.

I also needed to rename nfnetlink_queue.c to nfnetlink_queue_pkt.c
and to update net/netfilter/Makefile to support conditional compilation
of the conntrack integration.

This patch also adds CONFIG_NETFILTER_QUEUE_CT in case you want to explicitly
disable the integration between nf_conntrack and nfnetlink_queue.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

7c622345

18 Jun, 2012 2 commits

{nl,cfg,mac}80211: implement dot11MeshHWMPconfirmationInterval · 728b19e5

Committed by Chun-Yeow Yeoh on Jun 14, 2012

As defined in section 13.10.9.3 Case D (802.11-2012), this
control variable is used to limit the mesh STA to send only
one PREQ to a root mesh STA within this interval of time
(in TUs). The default value for this variable is set to
2000 TUs. However, in the current implementation, the maximum
configurable value of dot11MeshHWMPconfirmationInterval is
restricted by dot11MeshHWMPactivePathTimeout.
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
[line-break commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

728b19e5
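
The rate limit described in the dot11MeshHWMPconfirmationInterval commit above can be sketched as a simple interval check. One TU is 1024 microseconds; everything else here (`struct root_preq_state`, `may_send_preq()`, and the use of a monotonic microsecond clock) is a hypothetical model rather than the mac80211 implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define TU_USEC 1024u /* one 802.11 time unit is 1024 microseconds */

/* Hypothetical per-root-STA bookkeeping. */
struct root_preq_state {
    bool     sent_before;    /* a PREQ has been sent to this root */
    uint64_t last_preq_usec; /* monotonic timestamp of that PREQ */
};

/* Allows at most one PREQ per confirmation interval (given in TUs).
 * Assumes now_usec comes from a monotonic clock. */
static bool may_send_preq(struct root_preq_state *st,
                          uint64_t now_usec, uint32_t interval_tu)
{
    uint64_t interval_usec = (uint64_t)interval_tu * TU_USEC;

    if (st->sent_before && now_usec - st->last_preq_usec < interval_usec)
        return false;
    st->sent_before = true;
    st->last_preq_usec = now_usec;
    return true;
}
```

With the default of 2000 TUs, the interval works out to 2,048,000 microseconds, or about 2.05 seconds.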

net: remove my future former mail address · 31fdc555

Committed by Rémi Denis-Courmont on Jun 13, 2012

Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
Cc: Sakari Ailus <sakari.ailus@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31fdc555

17 Jun, 2012 1 commit

include/net/dst.h: neaten asterisk placement · 7f95e188

Committed by Eldad Zack on Jun 16, 2012

Fix code style - place the asterisk where it belongs.
Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7f95e188

16 Jun, 2012 9 commits

netfilter: add user-space connection tracking helper infrastructure · 12f7a505

Committed by Pablo Neira Ayuso on May 13, 2012

There are good reasons to support helpers in user-space instead:

* Rapid connection tracking helper development, as developing code
  in user-space is usually faster.

* Reliability: A buggy helper does not crash the kernel. Moreover,
  we can monitor the helper process and restart it in case of problems.

* Security: Avoid complex string matching and mangling in kernel-space
  running in privileged mode. Going further, we can even think about
  running user-space helpers as a non-root process.

* Extensibility: It allows the development of very specific helpers (most
  likely non-standard proprietary protocols) that are very likely not to be
  accepted for mainline inclusion in the form of kernel-space connection
  tracking helpers.

This patch adds the infrastructure to allow the implementation of
user-space conntrack helpers by means of the new nfnetlink subsystem
`nfnetlink_cthelper' and the existing queueing infrastructure
(nfnetlink_queue).

I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register
ipv[4|6]_helper which results from splitting ipv[4|6]_confirm into
two pieces. This change is required not to break NAT sequence
adjustment and conntrack confirmation for traffic that is enqueued
to our user-space conntrack helpers.

Basic operation, in a few steps:

1) Register user-space helper by means of `nfct':

 nfct helper add ftp inet tcp

 [ It must be a valid existing helper supported by conntrack-tools ]

2) Add rules to enable the FTP user-space helper which is
   used to track traffic going to TCP port 21.

For locally generated packets:

 iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp

For non-locally generated packets:

 iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp

3) Run the test conntrackd in helper mode (see example files under
   doc/helper/conntrackd.conf):

 conntrackd

4) Generate FTP traffic. If everything is OK, then conntrackd
   should create expectations (you can check that with `conntrack'):

 conntrack -E expect

    [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp
[DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp

This confirms that our test helper is receiving packets including the
conntrack information, and adding expectations in kernel-space.

The user-space helper can also store its private tracking information
in the conntrack structure in the kernel via the CTA_HELP_INFO attribute.
The kernel will consider this a binary blob whose layout is unknown. This
information will be included in the information that is transferred
to user-space via glue code that integrates nfnetlink_queue and
ctnetlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

12f7a505

netfilter: ctnetlink: add CTA_HELP_INFO attribute · ae243bee

Committed by Pablo Neira Ayuso on Jun 07, 2012

This attribute can be used to modify and to dump the internal
protocol information.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ae243bee

netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled · 8c88f87c

Committed by Pablo Neira Ayuso on Jun 07, 2012

User-space programs that receive traffic via NFQUEUE may mangle packets.
If NAT is enabled, this usually confuses sequence tracking, leading to
traffic disruptions.

With this patch, nfnl_queue will make the corresponding NAT TCP sequence
adjustment if:

1) The packet has been mangled,
2) the NFQA_CFG_F_CONNTRACK flag has been set, and
3) NAT is detected.

There are some records on the Internet complaining about this issue:
http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables

For now, we only support TCP since we have no helpers for DCCP or SCTP.
Better to add this if we ever have some helper over those layer 4 protocols.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

8c88f87c

netfilter: nf_ct_helper: implement variable length helper private data · 1afc5679

Committed by Pablo Neira Ayuso on Jun 07, 2012

This patch uses the new variable length conntrack extensions.

Instead of using union nf_conntrack_help, which contains all the
helper private data information, we allocate a variable length
area to store the private helper data.

This patch includes the modification of all existing helpers.
It also adds a couple of header includes to avoid compilation
warnings.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

1afc5679
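
The difference between a fixed union and a variable length private area, as in the nf_ct_helper commit above, can be shown with a generic C sketch. `union help_fixed`, `struct help_var`, and `help_alloc()` are hypothetical and use plain malloc() rather than the conntrack extension allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Fixed layout: every entry pays for the largest member, whichever
 * helper actually uses it. */
union help_fixed {
    char small_helper[8];
    char large_helper[64];
};

/* Variable layout: a small header followed by exactly as many bytes
 * as this particular helper needs. */
struct help_var {
    size_t        data_len;
    unsigned char data[]; /* C99 flexible array member */
};

/* Allocates a zeroed private area of data_len bytes; the caller frees it. */
static struct help_var *help_alloc(size_t data_len)
{
    struct help_var *h = malloc(sizeof(*h) + data_len);

    if (h == NULL)
        return NULL;
    h->data_len = data_len;
    memset(h->data, 0, data_len);
    return h;
}
```

A helper that needs 8 bytes then costs the header plus 8 bytes, instead of the size of the largest union member.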

netfilter: nf_ct_ext: support variable length extensions · 3cf4c7e3

Committed by Pablo Neira Ayuso on Feb 01, 2012

We can now define conntrack extensions of variable size. This
patch is useful to get rid of these unions:

union nf_conntrack_help
union nf_conntrack_proto
union nf_conntrack_nat_help
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

3cf4c7e3

netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names · 3a8fc53a

Committed by Pablo Neira Ayuso on Jan 15, 2012

This patch modifies the struct nf_conntrack_helper to allocate
the room for the helper name. The maximum length is 16 bytes
(this was already introduced in 2.6.24).

For the maximum length of expectation policy names, I have
also selected 16 bytes.

This patch is required by the follow-up patch to support
user-space connection tracking helpers.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

3a8fc53a

Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" · e8803b6c

Committed by David S. Miller on Jun 16, 2012

This reverts commit 2a0c451a.

It causes crashes, because now ip6_null_entry is used before
it is initialized.
Signed-off-by: David S. Miller <davem@davemloft.net>

e8803b6c

ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route · 2a0c451a

Committed by Thomas Graf on Jun 14, 2012

/proc/net/ipv6_route reflects the contents of fib_table_hash. The proc
handler is installed in ip6_route_net_init() whereas fib_table_hash is
allocated in fib6_net_init() _after_ the proc handler has been installed.

This opens up a short time frame to access fib_table_hash with its pants
down.

fib6_init() as a whole can't be moved to an earlier position as it also
registers the rtnetlink message handlers which should be registered at
the end. Therefore split it into fib6_init() which is run early and
fib6_init_late() to register the rtnetlink message handlers.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2a0c451a

ipv6: Handle PMTU in ICMP error handlers. · 81aded24

Committed by David S. Miller on Jun 15, 2012

One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts
to handle the error pass the 32-bit info cookie in network byte order
whereas ipv4 passes it around in host byte order.

Like the ipv4 side, we have two helper functions.  One for when we
have a socket context and one for when we do not.

ip6ip6 tunnels are not handled here, because they handle PMTU events
by essentially relaying another ICMP packet-too-big message back to
the original sender.

This patch allows us to get rid of rt6_do_pmtu_disc().  It handles all
kinds of situations that simply cannot happen when we do the PMTU
update directly using a fully resolved route.

In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very
likely be removed or changed into a BUG_ON() check.  We should never
have a prefixed ipv6 route when we get there.

Another piece of strange history here is that TCP and DCCP, unlike in
ipv4, never invoke the update_pmtu() method from their ICMP error
handlers.  This is astonishing, since this is where we have the most
accurate context in which to make a PMTU update, namely a fully
connected socket and its associated cached socket route.
Signed-off-by: David S. Miller <davem@davemloft.net>

81aded24
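
The byte-order difference noted in the PMTU commit above is easy to get wrong, so here is a minimal sketch. The two helper names are hypothetical; only `ntohl()` and `htonl()` are real (POSIX) functions.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* IPv6-style error handler: the info cookie arrives in network byte
 * order and must be converted before it is used as an MTU. */
static uint32_t mtu_from_info_v6(uint32_t info_be)
{
    return ntohl(info_be);
}

/* IPv4-style error handler: the info cookie is already in host byte
 * order and is used as-is. */
static uint32_t mtu_from_info_v4(uint32_t info)
{
    return info;
}
```

On a little-endian host, skipping the conversion in the IPv6 case would turn an MTU of 1280 into a nonsensical value.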