1. 02 8月, 2013 1 次提交
    • M
      ipv6: prevent fib6_run_gc() contention · 2ac3ac8f
      Michal Kubeček 提交于
      On a high-traffic router with many processors and many IPv6 dst
      entries, soft lockup in fib6_run_gc() can occur when number of
      entries reaches gc_thresh.
      
      This happens because fib6_run_gc() uses fib6_gc_lock to allow
      only one thread to run the garbage collector but ip6_dst_gc()
      doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
      returns. On a system with many entries, this can take some time
      so that in the meantime, other threads pass the tests in
      ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
      the lock. They then have to run the garbage collector one after
      another which blocks them for quite long.
      
      Resolve this by replacing special value ~0UL of expire parameter
      to fib6_run_gc() by explicit "force" parameter to choose between
      spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
      force=false if gc_thresh is reached but not max_size.
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ac3ac8f
  2. 21 2月, 2013 1 次提交
  3. 18 1月, 2013 1 次提交
  4. 09 11月, 2012 1 次提交
  5. 04 11月, 2012 1 次提交
  6. 23 10月, 2012 1 次提交
  7. 19 9月, 2012 1 次提交
  8. 06 9月, 2012 1 次提交
  9. 05 7月, 2012 1 次提交
  10. 16 6月, 2012 2 次提交
  11. 11 6月, 2012 2 次提交
  12. 18 4月, 2012 2 次提交
    • J
      ipv6: clean up rt6_clean_expires · cda31e10
      Jiri Bohac 提交于
      Functionally, this change is a NOP.
      
      Semantically, rt6_clean_expires() wants to do rt->dst.from = NULL instead of
      rt->dst.expires = 0. It is clearing the RTF_EXPIRES flag, so the union is going
      to be treated as a pointer (dst.from) not a long (dst.expires).
      Signed-off-by: NJiri Bohac <jbohac@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cda31e10
    • J
      ipv6: fix rt6_update_expires · edfb5d46
      Jiri Bohac 提交于
      Commit 1716a961 (ipv6: fix problem with expired dst cache) broke PMTU
      discovery. rt6_update_expires() calls dst_set_expires(), which only updates
      dst->expires if it has not been set previously (expires == 0) or if the new
      expires is earlier than the current dst->expires.
      
      rt6_update_expires() needs to zero rt->dst.expires, otherwise it will contain
      ivalid data left over from rt->dst.from and will confuse dst_set_expires().
      Signed-off-by: NJiri Bohac <jbohac@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      edfb5d46
  13. 14 4月, 2012 1 次提交
    • G
      ipv6: fix problem with expired dst cache · 1716a961
      Gao feng 提交于
      If the ipv6 dst cache which copy from the dst generated by ICMPV6 RA packet.
      this dst cache will not check expire because it has no RTF_EXPIRES flag.
      So this dst cache will always be used until the dst gc run.
      
      Change the struct dst_entry,add a union contains new pointer from and expires.
      When rt6_info.rt6i_flags has no RTF_EXPIRES flag,the dst.expires has no use.
      we can use this field to point to where the dst cache copy from.
      The dst.from is only used in IPV6.
      
      rt6_check_expired check if rt6_info.dst.from is expired.
      
      ip6_rt_copy only set dst.from when the ort has flag RTF_ADDRCONF
      and RTF_DEFAULT.then hold the ort.
      
      ip6_dst_destroy release the ort.
      
      Add some functions to operate the RTF_EXPIRES flag and expires(from) together.
      and change the code to use these new adding functions.
      
      Changes from v5:
      modify ip6_route_add and ndisc_router_discovery to use new adding functions.
      
      Only set dst.from when the ort has flag RTF_ADDRCONF
      and RTF_DEFAULT.then hold the ort.
      Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1716a961
  14. 31 12月, 2011 1 次提交
    • J
      IPv6: Avoid taking write lock for /proc/net/ipv6_route · 32b293a5
      Josh Hunt 提交于
      During some debugging I needed to look into how /proc/net/ipv6_route
      operated and in my digging I found its calling fib6_clean_all() which uses
      "write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
      found this on 2.6.32, but reading the code I believe the same basic idea
      exists currently. Looking at the rtnetlink code they are only calling
      "read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
      reading from proc isn't the recommended way of fetching the ipv6 route
      table; taking a write lock seems unnecessary and would probably cause
      network performance issues.
      
      To verify this I loaded up the ipv6 route table and then ran iperf in 3
      cases:
        * doing nothing
        * reading ipv6 route table via proc
          (while :; do cat /proc/net/ipv6_route > /dev/null; done)
        * reading ipv6 route table via rtnetlink
          (while :; do ip -6 route show table all > /dev/null; done)
      
      * Load the ipv6 route table up with:
        * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done
      
      * iperf commands:
        * client: iperf -i 1 -V -c <ipv6 addr>
        * server: iperf -V -s
      
      * iperf results - 3 runs each (in Mbits/sec)
        * nothing: client: 927,927,927 server: 927,927,927
        * proc: client: 179,97,96,113 server: 142,112,133
        * iproute: client: 928,927,928 server: 927,927,927
      
      lock_stat shows taking the write lock is causing the slowdown. Using this
      info I decided to write a version of fib6_clean_all() which replaces
      write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
      this new function I see the same results as with my rtnetlink iperf test.
      Signed-off-by: NJosh Hunt <joshhunt00@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32b293a5
  15. 29 12月, 2011 1 次提交
  16. 18 7月, 2011 1 次提交
  17. 25 4月, 2011 1 次提交
  18. 23 4月, 2011 1 次提交
  19. 16 4月, 2011 1 次提交
  20. 13 3月, 2011 1 次提交
  21. 11 2月, 2011 1 次提交
    • D
      inet: Create a mechanism for upward inetpeer propagation into routes. · 6431cbc2
      David S. Miller 提交于
      If we didn't have a routing cache, we would not be able to properly
      propagate certain kinds of dynamic path attributes, for example
      PMTU information and redirects.
      
      The reason is that if we didn't have a routing cache, then there would
      be no way to lookup all of the active cached routes hanging off of
      sockets, tunnels, IPSEC bundles, etc.
      
      Consider the case where we created a cached route, but no inetpeer
      entry existed and also we were not asked to pre-COW the route metrics
      and therefore did not force the creation a new inetpeer entry.
      
      If we later get a PMTU message, or a redirect, and store this
      information in a new inetpeer entry, there is no way to teach that
      cached route about the newly existing inetpeer entry.
      
      The facilities implemented here handle this problem.
      
      First we create a generation ID.  When we create a cached route of any
      kind, we remember the generation ID at the time of attachment.  Any
      time we force-create an inetpeer entry in response to new path
      information, we bump that generation ID.
      
      The dst_ops->check() callback is where the knowledge of this event
      is propagated.  If the global generation ID does not equal the one
      stored in the cached route, and the cached route has not attached
      to an inetpeer yet, we look it up and attach if one is found.  Now
      that we've updated the cached route's information, we update the
      route's generation ID too.
      
      This clears the way for implementing PMTU and redirects directly in
      the inetpeer cache.  There is absolutely no need to consult cached
      route information in order to maintain this information.
      
      At this point nothing bumps the inetpeer genids, that comes in the
      later changes which handle PMTUs and redirects using inetpeers.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6431cbc2
  22. 01 12月, 2010 1 次提交
  23. 11 6月, 2010 1 次提交
  24. 02 4月, 2010 1 次提交
  25. 19 2月, 2010 1 次提交
  26. 13 2月, 2010 1 次提交
    • P
      ipv6: fib: fix crash when changing large fib while dumping it · 2bec5a36
      Patrick McHardy 提交于
      When the fib size exceeds what can be dumped in a single skb, the
      dump is suspended and resumed once the last skb has been received
      by userspace. When the fib is changed while the dump is suspended,
      the walker might contain stale pointers, causing a crash when the
      dump is resumed.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      IP: [<ffffffffa01bce04>] fib6_walk_continue+0xbb/0x124 [ipv6]
      PGD 5347a067 PUD 65c7067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      ...
      RIP: 0010:[<ffffffffa01bce04>]
      [<ffffffffa01bce04>] fib6_walk_continue+0xbb/0x124 [ipv6]
      ...
      Call Trace:
       [<ffffffff8104aca3>] ? mutex_spin_on_owner+0x59/0x71
       [<ffffffffa01bd105>] inet6_dump_fib+0x11b/0x1b9 [ipv6]
       [<ffffffff81371af4>] netlink_dump+0x5b/0x19e
       [<ffffffff8134f288>] ? consume_skb+0x28/0x2a
       [<ffffffff81373b69>] netlink_recvmsg+0x1ab/0x2c6
       [<ffffffff81372781>] ? netlink_unicast+0xfa/0x151
       [<ffffffff813483e0>] __sock_recvmsg+0x6d/0x79
       [<ffffffff81348a53>] sock_recvmsg+0xca/0xe3
       [<ffffffff81066d4b>] ? autoremove_wake_function+0x0/0x38
       [<ffffffff811ed1f8>] ? radix_tree_lookup_slot+0xe/0x10
       [<ffffffff810b3ed7>] ? find_get_page+0x90/0xa5
       [<ffffffff810b5dc5>] ? filemap_fault+0x201/0x34f
       [<ffffffff810ef152>] ? fget_light+0x2f/0xac
       [<ffffffff813519e7>] ? verify_iovec+0x4f/0x94
       [<ffffffff81349a65>] sys_recvmsg+0x14d/0x223
      
      Store the serial number when beginning to walk the fib and reload
      pointers when continuing to walk after a change occured. Similar
      to other dumping functions, this might cause unrelated entries to
      be missed when entries are deleted.
      Tested-by: NBen Greear <greearb@candelatech.com>
      Signed-off-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2bec5a36
  27. 04 11月, 2009 1 次提交
  28. 31 7月, 2009 1 次提交
    • N
      xfrm: select sane defaults for xfrm[4|6] gc_thresh · a33bc5c1
      Neil Horman 提交于
      Choose saner defaults for xfrm[4|6] gc_thresh values on init
      
      Currently, the xfrm[4|6] code has hard-coded initial gc_thresh values
      (set to 1024).  Given that the ipv4 and ipv6 routing caches are sized
      dynamically at boot time, the static selections can be non-sensical.
      This patch dynamically selects an appropriate gc threshold based on
      the corresponding main routing table size, using the assumption that
      we should in the worst case be able to handle as many connections as
      the routing table can.
      
      For ipv4, the maximum route cache size is 16 * the number of hash
      buckets in the route cache.  Given that xfrm4 starts garbage
      collection at the gc_thresh and prevents new allocations at 2 *
      gc_thresh, we set gc_thresh to half the maximum route cache size.
      
      For ipv6, its a bit trickier.  there is no maximum route cache size,
      but the ipv6 dst_ops gc_thresh is statically set to 1024.  It seems
      sane to select a simmilar gc_thresh for the xfrm6 code that is half
      the number of hash buckets in the v6 route cache times 16 (like the v4
      code does).
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a33bc5c1
  29. 05 3月, 2008 1 次提交
  30. 04 3月, 2008 3 次提交
  31. 08 2月, 2008 1 次提交
  32. 29 1月, 2008 4 次提交
新手
引导
客服 返回
顶部