1. 01 Feb, 2011 1 commit
    • R
      net: Add default_mtu() methods to blackhole dst_ops · ec831ea7
      Roland Dreier committed
      When an IPSEC SA is still being set up, __xfrm_lookup() will return
      -EREMOTE and so ip_route_output_flow() will return a blackhole route.
      This can happen in a sendmsg call, and after d33e4553 ("net: Abstract
      default MTU metric calculation behind an accessor.") this leads to a
      crash in ip_append_data() because the blackhole dst_ops have no
      default_mtu() method and so dst_mtu() calls a NULL pointer.
      
      Fix this by adding default_mtu() methods (that simply return 0, matching
      the old behavior) to the blackhole dst_ops.
      
      The IPv4 part of this patch fixes a crash that I saw when using an IPSEC
      VPN; the IPv6 part is untested because I don't have an IPv6 VPN, but it
      looks to be needed as well.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
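
To illustrate the fix described in this commit, here is a minimal C sketch of an ops table whose MTU accessor falls back to a callback. All `toy_` names are hypothetical stand-ins and not the kernel's real definitions; the only behavior taken from the commit message is that the blackhole callback returns 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct toy_dst;

struct toy_dst_ops {
    /* Computes the default MTU when no explicit metric is stored. */
    unsigned int (*default_mtu)(const struct toy_dst *dst);
};

struct toy_dst {
    const struct toy_dst_ops *ops;
    unsigned int mtu_metric;   /* 0 means "not set, ask the ops callback" */
};

/* The fix: a blackhole route simply reports 0, matching the old behavior. */
static unsigned int toy_blackhole_default_mtu(const struct toy_dst *dst)
{
    (void)dst;
    return 0;
}

static const struct toy_dst_ops toy_blackhole_ops = {
    .default_mtu = toy_blackhole_default_mtu,
};

/* Simplified accessor. If default_mtu were left NULL in the ops table,
 * the fallback call below would jump through a NULL pointer. */
static unsigned int toy_dst_mtu(const struct toy_dst *dst)
{
    if (dst->mtu_metric)
        return dst->mtu_metric;
    return dst->ops->default_mtu(dst);
}
```

With the callback filled in, a blackhole entry whose metric slot is 0 yields an MTU of 0 instead of crashing.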
  2. 28 Jan, 2011 1 commit
    • D
      ipv6: Remove route peer binding assertions. · 8f2771f2
      David S. Miller committed
      They are bogus.  The basic idea is that I wanted to make sure
      that prefixed routes never bind to peers.
      
      The test I used was whether RTF_CACHE was set.
      
      But first of all, the RTF_CACHE flag is set at different spots
      depending upon which ip6_rt_copy() caller you're talking about.
      
      I've validated all of the code paths, and even in the future
      where we bind peers more aggressively (for route metric COW'ing)
      we never bind to prefix'd routes, only fully specified ones.
      This even applies when addrconf or icmp6 routes are allocated.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 27 Jan, 2011 1 commit
  4. 26 Jan, 2011 1 commit
  5. 25 Jan, 2011 1 commit
  6. 19 Jan, 2011 1 commit
    • R
      ipv6: Silence privacy extensions initialization · 2fdc1c80
      Romain Francoise committed
      When a network namespace is created (via CLONE_NEWNET), the loopback
      interface is automatically added to the new namespace, triggering a
      printk in ipv6_add_dev() if CONFIG_IPV6_PRIVACY is set.
      
      This is problematic for applications which use CLONE_NEWNET as
      part of a sandbox, like Chromium's suid sandbox or recent versions of
      vsftpd. On a busy machine, it can lead to thousands of useless
      "lo: Disabled Privacy Extensions" messages appearing in dmesg.
      
      It's easy enough to check the status of privacy extensions via the
      use_tempaddr sysctl, so just removing the printk seems like the most
      sensible solution.
      Signed-off-by: Romain Francoise <romain@orebokech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 13 Jan, 2011 2 commits
  8. 12 Jan, 2011 2 commits
  9. 11 Jan, 2011 1 commit
  10. 20 Dec, 2010 1 commit
  11. 19 Dec, 2010 2 commits
  12. 17 Dec, 2010 3 commits
    • S
      ipv6: don't flush routes when setting loopback down · 29ba5fed
      stephen hemminger committed
      When the loopback device is being brought down, keep the route table
      entries because they are special. The entries in the local table for
      link-local routes and the ::1 address should not be purged.
      
      This is a suboptimal solution to the problem and should be replaced
      by a better fix in the future.
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • O
      net: fix nulls list corruptions in sk_prot_alloc · fcbdf09d
      Octavian Purdila committed
      Special care is taken inside sk_prot_alloc to avoid overwriting
      skc_node/skc_nulls_node. We should also avoid overwriting
      skc_bind_node/skc_portaddr_node.
      
      The patch fixes the following crash:
      
       BUG: unable to handle kernel paging request at fffffffffffffff0
       IP: [<ffffffff812ec6dd>] udp4_lib_lookup2+0xad/0x370
       [<ffffffff812ecc22>] __udp4_lib_lookup+0x282/0x360
       [<ffffffff812ed63e>] __udp4_lib_rcv+0x31e/0x700
       [<ffffffff812bba45>] ? ip_local_deliver_finish+0x65/0x190
       [<ffffffff812bbbf8>] ? ip_local_deliver+0x88/0xa0
       [<ffffffff812eda35>] udp_rcv+0x15/0x20
       [<ffffffff812bba45>] ip_local_deliver_finish+0x65/0x190
       [<ffffffff812bbbf8>] ip_local_deliver+0x88/0xa0
       [<ffffffff812bb2cd>] ip_rcv_finish+0x32d/0x6f0
       [<ffffffff8128c14c>] ? netif_receive_skb+0x99c/0x11c0
       [<ffffffff812bb94b>] ip_rcv+0x2bb/0x350
       [<ffffffff8128c14c>] netif_receive_skb+0x99c/0x11c0
      Signed-off-by: Leonard Crestez <lcrestez@ixiacom.com>
      Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      ipv6: delete expired route in ip6_pmtu_deliver · d3052b55
      Andrey Vagin committed
      The first big packets sent to a "low-MTU" client correctly
      trigger the creation of a temporary route containing the reduced MTU.
      
      But after the temporary route has expired, a new ICMP6 "packet too big"
      message will be sent; rt6_pmtu_discovery will find the previous EXPIRED
      route, check that its MTU isn't bigger than the one in the ICMP packet,
      and do nothing until the temporary route is deleted by gc.
      
      I ran a simple experiment:
      while :; do
          time ( dd if=/dev/zero bs=10K count=1 | ssh hostname dd of=/dev/null ) || break;
      done
      
      "time" reports real 0m0.197s if the temporary route hasn't expired, but
      it reports real 0m52.837s (!!!!) immediately after the temporary route
      has expired.
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 15 Dec, 2010 1 commit
  14. 14 Dec, 2010 1 commit
    • D
      net: Abstract default ADVMSS behind an accessor. · 0dbaee3b
      David S. Miller committed
      Make all RTAX_ADVMSS metric accesses go through a new helper function,
      dst_metric_advmss().
      
      Leave the actual default metric as "zero" in the real metric slot,
      and compute the actual default value dynamically via a new dst_ops
      AF specific callback.
      
      For stacked IPSEC routes, we use the advmss of the path which
      preserves existing behavior.
      
      Unlike ipv4/ipv6, DecNET ties the advmss to the mtu and thus updates
      advmss on pmtu updates.  This inconsistency in advmss handling
      results in more raw metric accesses than I wish we ended up with.
      Signed-off-by: David S. Miller <davem@davemloft.net>
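
The accessor pattern described in this commit can be sketched roughly as follows. This is an assumption-laden illustration: the `toy_` types and functions are hypothetical, and the per-family defaults (MTU minus IP header minus a 20-byte TCP header) are a simplification with no clamping; only the idea of a zero metric slot plus an address-family callback comes from the commit message.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins; not the kernel's real definitions. */
struct toy_route;

struct toy_route_ops {
    /* Address-family specific default, used when the stored metric is 0. */
    unsigned int (*default_advmss)(const struct toy_route *rt);
};

struct toy_route {
    const struct toy_route_ops *ops;
    unsigned int mtu;             /* assumed larger than the header sizes */
    unsigned int advmss_metric;   /* raw slot; 0 means "use the default" */
};

/* Assumed default for IPv4: 20-byte IP header + 20-byte TCP header. */
static unsigned int toy_ipv4_default_advmss(const struct toy_route *rt)
{
    return rt->mtu - 40;
}

/* Assumed default for IPv6: 40-byte IP header + 20-byte TCP header. */
static unsigned int toy_ipv6_default_advmss(const struct toy_route *rt)
{
    return rt->mtu - 60;
}

static const struct toy_route_ops toy_ipv4_ops = { toy_ipv4_default_advmss };
static const struct toy_route_ops toy_ipv6_ops = { toy_ipv6_default_advmss };

/* Single accessor: callers never read the raw metric slot directly. */
static unsigned int toy_metric_advmss(const struct toy_route *rt)
{
    if (rt->advmss_metric)
        return rt->advmss_metric;
    return rt->ops->default_advmss(rt);
}
```

Keeping the slot at zero means the default never has to be written into every route; an explicitly configured value still takes precedence.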
  15. 13 Dec, 2010 3 commits
  16. 11 Dec, 2010 5 commits
  17. 10 Dec, 2010 3 commits
    • U
      a34f0b31
    • E
      net: optimize INET input path further · 68835aba
      Eric Dumazet committed
      Followup of commit b178bb3d (net: reorder struct sock fields)
      
      Optimize INET input path a bit further, by :
      
      1) moving sk_refcnt close to sk_lock.
      
      This reduces number of dirtied cache lines by one on 64bit arches (and
      64 bytes cache line size).
      
      2) moving inet_daddr & inet_rcv_saddr at the beginning of sk
      
      (same cache line as hash / family / bound_dev_if / nulls_node)
      
      This reduces the number of cache lines accessed in lookups by one, and
      doesn't increase the size of inet and timewait socks.
      inet and tw sockets now share the same placeholder for these fields.
      
      Before patch :
      
      offsetof(struct sock, sk_refcnt) = 0x10
      offsetof(struct sock, sk_lock) = 0x40
      offsetof(struct sock, sk_receive_queue) = 0x60
      offsetof(struct inet_sock, inet_daddr) = 0x270
      offsetof(struct inet_sock, inet_rcv_saddr) = 0x274
      
      After patch :
      
      offsetof(struct sock, sk_refcnt) = 0x44
      offsetof(struct sock, sk_lock) = 0x48
      offsetof(struct sock, sk_receive_queue) = 0x68
      offsetof(struct inet_sock, inet_daddr) = 0x0
      offsetof(struct inet_sock, inet_rcv_saddr) = 0x4
      
      compute_score() (udp or tcp) now uses a single cache line per ignored
      item, instead of two.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
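
The offsets quoted in this commit can be checked against the stated 64-byte cache line size with a small helper. The helper name is hypothetical; the offsets in the comments are the ones listed in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_LINE_SIZE 64   /* line size assumed by the commit message */

/* Index of the cache line that contains a given byte offset. */
static size_t toy_cache_line_of(size_t offset)
{
    return offset / TOY_CACHE_LINE_SIZE;
}

/* Returns 1 when two fields at the given offsets start in the same line. */
static int toy_same_cache_line(size_t a, size_t b)
{
    return toy_cache_line_of(a) == toy_cache_line_of(b);
}

/*
 * Before the patch: sk_refcnt at 0x10 and sk_lock at 0x40 start in
 * different lines, and inet_daddr at 0x270 is far from the start of sk.
 * After the patch: sk_refcnt at 0x44 and sk_lock at 0x48 share a line,
 * and inet_daddr at 0x0 sits in the first line of the structure.
 */
```

For example, `toy_same_cache_line(0x10, 0x40)` is false while `toy_same_cache_line(0x44, 0x48)` is true, which matches the claim that one fewer cache line is dirtied.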
    • D
      net: Abstract away all dst_entry metrics accesses. · defb3519
      David S. Miller committed
      Use helper functions to hide all direct accesses, especially writes,
      to dst_entry metrics values.
      
      This will allow us to:
      
      1) More easily change how the metrics are stored.
      
      2) Implement COW for metrics.
      
      In particular this will help us put metrics into the inetpeer
      cache if that is what we end up doing.  We can make the _metrics
      member a pointer instead of an array, initially have it point
      at the read-only metrics in the FIB, and then on the first set
      grab an inetpeer entry and point the _metrics member there.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
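
The copy-on-write idea outlined in this commit might look roughly like the sketch below. Everything here is hypothetical: the `toy_` names, the fixed metric count, and the embedded private array, which merely stands in for the inetpeer entry mentioned in the message.

```c
#include <assert.h>
#include <string.h>

#define TOY_METRIC_MAX 4   /* hypothetical number of metric slots */

struct toy_entry {
    /* Initially points at shared, read-only defaults (the "FIB" metrics). */
    const unsigned int *metrics;
    /* Private storage, standing in for an inetpeer entry. */
    unsigned int private_metrics[TOY_METRIC_MAX];
    int owns_copy;
};

/* Read helper: all reads go through the pointer, shared or private. */
static unsigned int toy_metric_get(const struct toy_entry *e, int idx)
{
    return e->metrics[idx];
}

/* Write helper: the first write copies the shared values, then redirects. */
static void toy_metric_set(struct toy_entry *e, int idx, unsigned int val)
{
    if (!e->owns_copy) {
        memcpy(e->private_metrics, e->metrics, sizeof(e->private_metrics));
        e->metrics = e->private_metrics;
        e->owns_copy = 1;
    }
    e->private_metrics[idx] = val;
}
```

Because callers only use the two helpers, the storage behind them can change without touching call sites, which is the point of hiding the direct accesses.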
  18. 03 Dec, 2010 4 commits
  19. 02 Dec, 2010 3 commits
  20. 01 Dec, 2010 2 commits
  21. 29 Nov, 2010 1 commit