1. 10 Sep, 2015 2 commits
    • ipv6: fix ifnullfree.cocci warnings · 52fe51f8
      Wu Fengguang committed
      net/ipv6/route.c:2946:3-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.
      
       NULL check before some freeing functions is not needed.
      
       Based on checkpatch warning
       "kfree(NULL) is safe this check is probably not required"
       and kfreeaddr.cocci by Julia Lawall.
      
      Generated by: scripts/coccinelle/free/ifnullfree.cocci
      
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
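The warning above rests on the fact that freeing a NULL pointer is a no-op. A minimal userspace sketch of the same cleanup, using free() as a stand-in for kfree(); the struct and function names are hypothetical and are not taken from net/ipv6/route.c:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container, used only for this illustration. */
struct route_cfg {
    char *metrics;   /* optional buffer, may stay NULL */
};

/* Before: the NULL test is redundant because free(NULL) does nothing. */
void cfg_release_checked(struct route_cfg *cfg)
{
    assert(cfg != NULL);
    if (cfg->metrics)
        free(cfg->metrics);
    cfg->metrics = NULL;
}

/* After: the shape ifnullfree.cocci suggests, with the check dropped. */
void cfg_release(struct route_cfg *cfg)
{
    assert(cfg != NULL);
    free(cfg->metrics);
    cfg->metrics = NULL;
}
```

Both functions behave identically for NULL and non-NULL buffers; the second is simply shorter.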
    • ipv6: fix multipath route replace error recovery · 6b9ea5a6
      Roopa Prabhu committed
      Problem:
      The ECMP route replace support for IPv6 in the kernel deletes the
      existing ECMP route too early, i.e. when it installs the first nexthop.
      If installing a subsequent nexthop then fails, it is too late to
      recover the already deleted route, leaving the FIB in an
      inconsistent state.
      
      This patch reduces the possibility of this by doing the following:
      a) Changes the existing multipath route add code to a two-stage
        process: build the rt6_infos, then insert them. The rt6_info
        creation code in ip6_route_add is moved into
        ip6_route_info_create.
      b) This ensures that most errors are caught while building the
        rt6_infos, so we fail early.
      c) Separates the multipath add and del code, because add needs the
        special two-stage mode in a) and delete essentially does not care.
      d) If the code still fails while inserting a route, a warning is
        printed (this should be unlikely).
      
      Before the patch:
      $ ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      /* Try replacing the route with a duplicate nexthop */
      $ ip -6 route change 3000:1000:1000:1000::2/128 \
          nexthop via fe80::202:ff:fe00:b dev swp49s0 \
          nexthop via fe80::202:ff:fe00:d dev swp49s1 \
          nexthop via fe80::202:ff:fe00:d dev swp49s1
      RTNETLINK answers: File exists
      
      $ ip -6 route show
      /* previously added ecmp route 3000:1000:1000:1000::2 disappears from
       * kernel */
      
      After the patch:
      $ ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      /* Try replacing the route with a duplicate nexthop */
      $ ip -6 route change 3000:1000:1000:1000::2/128 \
          nexthop via fe80::202:ff:fe00:b dev swp49s0 \
          nexthop via fe80::202:ff:fe00:d dev swp49s1 \
          nexthop via fe80::202:ff:fe00:d dev swp49s1
      RTNETLINK answers: File exists
      
      $ ip -6 route show
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
      3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
      
      Fixes: 27596472 ("ipv6: fix ECMP route replacement")
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
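The two-stage idea in this commit (build and validate everything first, touch the installed route only afterwards) can be modelled in a few lines. This is a toy userspace sketch, not the kernel code: the table, the structs and the function names are all hypothetical, and a duplicate nexthop is used as the only build-time error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NH 8

/* Hypothetical nexthop and route table, for illustration only. */
struct nexthop { char gw[40]; char dev[16]; };
struct route_table { struct nexthop nh[MAX_NH]; int count; };

/* Stage 1: validate the request into a staging buffer.
 * Returns 0 on success, -1 on error (e.g. a duplicate nexthop). */
int build_nexthops(const struct nexthop *req, int n, struct nexthop *staged)
{
    assert(req != NULL && staged != NULL);
    if (n <= 0 || n > MAX_NH)
        return -1;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++) {
            if (strcmp(req[i].gw, staged[j].gw) == 0 &&
                strcmp(req[i].dev, staged[j].dev) == 0)
                return -1;  /* duplicate: reject before touching the table */
        }
        staged[i] = req[i];
    }
    return 0;
}

/* Stage 2: replace the installed nexthops only if stage 1 succeeded. */
int replace_route(struct route_table *t, const struct nexthop *req, int n)
{
    struct nexthop staged[MAX_NH];

    if (build_nexthops(req, n, staged) != 0)
        return -1;          /* existing route is left untouched */
    memcpy(t->nh, staged, sizeof(staged[0]) * (size_t)n);
    t->count = n;
    return 0;
}
```

With the addresses from the commit message, a replace request listing fe80::202:ff:fe00:d twice fails in stage 1, so the three original nexthops stay installed.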
  2. 01 Sep, 2015 3 commits
  3. 30 Aug, 2015 1 commit
  4. 25 Aug, 2015 1 commit
  5. 21 Aug, 2015 4 commits
  6. 18 Aug, 2015 4 commits
    • lwt: Add support to redirect dst.input · 25368623
      Tom Herbert committed
      This patch adds the capability to redirect dst input in the same way
      that dst output is redirected by LWT.
      
      Also, save the original dst.input and dst.output when setting up
      lwtunnel redirection. These can be called by the client as a
      pass-through.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix a potential deadlock when creating pcpu rt · 9c7370a1
      Martin KaFai Lau committed
      rt6_make_pcpu_route() is called under read_lock(&table->tb6_lock).
      rt6_make_pcpu_route() calls ip6_rt_pcpu_alloc(rt) which then
      calls dst_alloc().  dst_alloc() _may_ call ip6_dst_gc() which takes
      the write_lock(&table->tb6_lock).  A visualized version:
      
      read_lock(&table->tb6_lock);
      rt6_make_pcpu_route();
      => ip6_rt_pcpu_alloc();
      => dst_alloc();
      => ip6_dst_gc();
      => write_lock(&table->tb6_lock); /* oops */
      
      The fix is to do a read_unlock first before calling ip6_rt_pcpu_alloc().
      
      A reported stack:
      
      [141625.537638] INFO: rcu_sched self-detected stall on CPU { 27}  (t=60000 jiffies g=4159086 c=4159085 q=2139)
      [141625.547469] Task dump for CPU 27:
      [141625.550881] mtr             R  running task        0 22121  22081 0x00000008
      [141625.558069]  0000000000000000 ffff88103f363d98 ffffffff8106e488 000000000000001b
      [141625.565641]  ffffffff81684900 ffff88103f363db8 ffffffff810702b0 0000000008000000
      [141625.573220]  ffffffff81684900 ffff88103f363de8 ffffffff8108df9f ffff88103f375a00
      [141625.580803] Call Trace:
      [141625.583345]  <IRQ>  [<ffffffff8106e488>] sched_show_task+0xc1/0xc6
      [141625.589650]  [<ffffffff810702b0>] dump_cpu_task+0x35/0x39
      [141625.595144]  [<ffffffff8108df9f>] rcu_dump_cpu_stacks+0x6a/0x8c
      [141625.601320]  [<ffffffff81090606>] rcu_check_callbacks+0x1f6/0x5d4
      [141625.607669]  [<ffffffff810940c8>] update_process_times+0x2a/0x4f
      [141625.613925]  [<ffffffff8109fbee>] tick_sched_handle+0x32/0x3e
      [141625.619923]  [<ffffffff8109fc2f>] tick_sched_timer+0x35/0x5c
      [141625.625830]  [<ffffffff81094a1f>] __hrtimer_run_queues+0x8f/0x18d
      [141625.632171]  [<ffffffff81094c9e>] hrtimer_interrupt+0xa0/0x166
      [141625.638258]  [<ffffffff8102bf2a>] local_apic_timer_interrupt+0x4e/0x52
      [141625.645036]  [<ffffffff8102c36f>] smp_apic_timer_interrupt+0x39/0x4a
      [141625.651643]  [<ffffffff8140b9e8>] apic_timer_interrupt+0x68/0x70
      [141625.657895]  <EOI>  [<ffffffff81346ee8>] ? dst_destroy+0x7c/0xb5
      [141625.664188]  [<ffffffff813d45b5>] ? fib6_flush_trees+0x20/0x20
      [141625.670272]  [<ffffffff81082b45>] ? queue_write_lock_slowpath+0x60/0x6f
      [141625.677140]  [<ffffffff8140aa33>] _raw_write_lock_bh+0x23/0x25
      [141625.683218]  [<ffffffff813d4553>] __fib6_clean_all+0x40/0x82
      [141625.689124]  [<ffffffff813d45b5>] ? fib6_flush_trees+0x20/0x20
      [141625.695207]  [<ffffffff813d6058>] fib6_clean_all+0xe/0x10
      [141625.700854]  [<ffffffff813d60d3>] fib6_run_gc+0x79/0xc8
      [141625.706329]  [<ffffffff813d0510>] ip6_dst_gc+0x85/0xf9
      [141625.711718]  [<ffffffff81346d68>] dst_alloc+0x55/0x159
      [141625.717105]  [<ffffffff813d09b5>] __ip6_dst_alloc.isra.32+0x19/0x63
      [141625.723620]  [<ffffffff813d1830>] ip6_pol_route+0x36a/0x3e8
      [141625.729441]  [<ffffffff813d18d6>] ip6_pol_route_output+0x11/0x13
      [141625.735700]  [<ffffffff813f02c8>] fib6_rule_action+0xa7/0x1bf
      [141625.741698]  [<ffffffff813d18c5>] ? ip6_pol_route_input+0x17/0x17
      [141625.748043]  [<ffffffff81357c48>] fib_rules_lookup+0xb5/0x12a
      [141625.754050]  [<ffffffff81141628>] ? poll_select_copy_remaining+0xf9/0xf9
      [141625.761002]  [<ffffffff813f0535>] fib6_rule_lookup+0x37/0x5c
      [141625.766914]  [<ffffffff813d18c5>] ? ip6_pol_route_input+0x17/0x17
      [141625.773260]  [<ffffffff813d008c>] ip6_route_output+0x7a/0x82
      [141625.779177]  [<ffffffff813c44c8>] ip6_dst_lookup_tail+0x53/0x112
      [141625.785437]  [<ffffffff813c45c3>] ip6_dst_lookup_flow+0x2a/0x6b
      [141625.791604]  [<ffffffff813ddaab>] rawv6_sendmsg+0x407/0x9b6
      [141625.797423]  [<ffffffff813d7914>] ? do_ipv6_setsockopt.isra.8+0xd87/0xde2
      [141625.804464]  [<ffffffff8139d4b4>] inet_sendmsg+0x57/0x8e
      [141625.810028]  [<ffffffff81329ba3>] sock_sendmsg+0x2e/0x3c
      [141625.815588]  [<ffffffff8132be57>] SyS_sendto+0xfe/0x143
      [141625.821063]  [<ffffffff813dd551>] ? rawv6_setsockopt+0x5e/0x67
      [141625.827146]  [<ffffffff8132c9f8>] ? sock_common_setsockopt+0xf/0x11
      [141625.833660]  [<ffffffff8132c08c>] ? SyS_setsockopt+0x81/0xa2
      [141625.839565]  [<ffffffff8140ac17>] entry_SYSCALL_64_fastpath+0x12/0x6a
      
      Fixes: d52d3997 ("ipv6: Create percpu rt6_info")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
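The lock ordering problem described in this commit can be illustrated with a single-threaded toy model. The sketch below is an assumption-laden stand-in, not kernel code: toy_rwlock only counts holders, and its "trylock" returns false in the situation where a real write_lock() would spin forever. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock model: tracks holders instead of blocking. */
struct toy_rwlock { int readers; bool writer; };

void toy_read_lock(struct toy_rwlock *l)   { assert(!l->writer); l->readers++; }
void toy_read_unlock(struct toy_rwlock *l) { assert(l->readers > 0); l->readers--; }

/* Returns false where a real write lock could never be acquired. */
bool toy_write_trylock(struct toy_rwlock *l)
{
    if (l->readers > 0 || l->writer)
        return false;
    l->writer = true;
    return true;
}

void toy_write_unlock(struct toy_rwlock *l) { assert(l->writer); l->writer = false; }

/* Stand-in for an allocator that may run garbage collection,
 * which in turn needs the write lock. */
bool toy_alloc_with_gc(struct toy_rwlock *l)
{
    if (!toy_write_trylock(l))
        return false;       /* the real code would deadlock here */
    toy_write_unlock(l);
    return true;
}

/* Buggy order: allocate while still holding the read lock. */
bool lookup_buggy(struct toy_rwlock *l)
{
    toy_read_lock(l);
    bool ok = toy_alloc_with_gc(l);
    toy_read_unlock(l);
    return ok;
}

/* Fixed order: drop the read lock before allocating. */
bool lookup_fixed(struct toy_rwlock *l)
{
    toy_read_lock(l);
    /* ... the route lookup would happen here, under the read lock ... */
    toy_read_unlock(l);
    return toy_alloc_with_gc(l);
}
```

In this model lookup_buggy() reports the would-be deadlock, while lookup_fixed() allocates successfully and leaves the lock free.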
    • ipv6: Add rt6_make_pcpu_route() · a73e4195
      Martin KaFai Lau committed
      This is preparatory work for fixing a potential deadlock when creating
      a pcpu rt.
      
      The current rt6_get_pcpu_route() will also create a pcpu rt if one does not
      exist.  This patch moves the pcpu rt creation logic into another function,
      rt6_make_pcpu_route().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Remove un-used argument from ip6_dst_alloc() · ad706862
      Martin KaFai Lau committed
      After 4b32b5ad ("ipv6: Stop rt6_info from using inet_peer's metrics"),
      ip6_dst_alloc() does not need the 'table' argument.  This patch
      cleans it up.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 14 Aug, 2015 2 commits
    • net: ipv6 sysctl option to ignore routes when nexthop link is down · 35103d11
      Andy Gospodarek committed
      Like the ipv4 patch with a similar title, this adds a sysctl that lets
      the user change routing behavior based on whether the link of the
      interface associated with the nexthop is up or down.  The default
      setting preserves the current behavior, but anyone who enables it
      will notice that nexthops on down interfaces are no longer
      selected:
      
      net.ipv6.conf.all.ignore_routes_with_linkdown = 0
      net.ipv6.conf.default.ignore_routes_with_linkdown = 0
      net.ipv6.conf.lo.ignore_routes_with_linkdown = 0
      ...
      
      When the above sysctls are set, not only will link status be reported to
      userspace, but an indication that a nexthop is dead and will not be used
      is also reported.
      
      1000::/8 via 7000::2 dev p7p1  metric 1024 dead linkdown  pref medium
      1000::/8 via 8000::2 dev p8p1  metric 1024  pref medium
      7000::/8 dev p7p1  proto kernel  metric 256 dead linkdown  pref medium
      8000::/8 dev p8p1  proto kernel  metric 256  pref medium
      9000::/8 via 8000::2 dev p8p1  metric 2048  pref medium
      9000::/8 via 7000::2 dev p7p1  metric 1024 dead linkdown  pref medium
      fe80::/64 dev p7p1  proto kernel  metric 256 dead linkdown  pref medium
      fe80::/64 dev p8p1  proto kernel  metric 256  pref medium
      
      This also adds devconf support and notification when sysctl values
      change.
      
      v2: drop use of rt6i_nhflags since it is not needed right now
      Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
      Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
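The effect of ignore_routes_with_linkdown on nexthop selection can be sketched with a toy selector. This is a simplified model under stated assumptions (lowest metric wins, one flag per nexthop); the struct and function are hypothetical and do not mirror the kernel's route lookup code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical nexthop record for this illustration. */
struct nh {
    const char *dev;
    int metric;
    bool linkdown;   /* link of the nexthop's interface is down */
};

/* Return the index of the lowest-metric usable nexthop, or -1 if none.
 * With ignore_linkdown set, nexthops flagged linkdown are skipped. */
int pick_nexthop(const struct nh *nhs, size_t n, bool ignore_linkdown)
{
    int best = -1;

    assert(nhs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ignore_linkdown && nhs[i].linkdown)
            continue;
        if (best < 0 || nhs[i].metric < nhs[best].metric)
            best = (int)i;
    }
    return best;
}
```

Applied to the 9000::/8 entries in the listing above, the model picks p7p1 (metric 1024) when the sysctl is 0 and falls back to p8p1 (metric 2048) when it is 1, because p7p1 is marked linkdown.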
    • net: track link status of ipv6 nexthops · cea45e20
      Andy Gospodarek committed
      Add support to track the current link status of ipv6 nexthops to match
      recent changes that added support for ipv4 nexthops.  This takes a
      simple approach to tracking linkdown status for nexthops: it checks
      the dev for the dst entry and sets the proper flags to be used
      in the netlink message.
      
      v2: drop use of rt6i_nhflags since it is not needed right now
      Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
      Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 11 Aug, 2015 1 commit
  9. 27 Jul, 2015 5 commits
  10. 22 Jul, 2015 2 commits
  11. 04 Jul, 2015 1 commit
  12. 26 May, 2015 7 commits
  13. 22 May, 2015 1 commit
  14. 21 May, 2015 2 commits
    • ipv6: fix ECMP route replacement · 27596472
      Michal Kubeček committed
      When replacing an IPv6 multipath route with "ip route replace", i.e.
      NLM_F_CREATE | NLM_F_REPLACE, fib6_add_rt2node() replaces only the
      first matching route without fixing its siblings, resulting in a
      corrupted siblings linked list; removing one of the siblings can then
      end in an infinite loop.
      
      IPv6 ECMP implementation is a bit different from IPv4 so that route
      replacement cannot work in exactly the same way. This should be a
      reasonable approximation:
      
      1. If the new route is ECMP-able and there is a matching ECMP-able one
      already, replace it and all its siblings (if any).
      
      2. If the new route is ECMP-able and no matching ECMP-able route exists,
      replace first matching non-ECMP-able (if any) or just add the new one.
      
      3. If the new route is not ECMP-able, replace first matching
      non-ECMP-able route (if any) or add the new route.
      
      We also need to remove the NLM_F_REPLACE flag after replacing old
      route(s) by first nexthop of an ECMP route so that each subsequent
      nexthop does not replace previous one.
      
      Fixes: 51ebd318 ("ipv6: add support of equal cost multipath (ECMP)")
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
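The three replacement rules in this commit message reduce to a small decision function. The sketch below only encodes the rules as written above; the enum, the function names and the boolean inputs are hypothetical and are not how fib6_add_rt2node() is structured.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of a replace request. */
enum replace_action {
    REPLACE_ECMP_GROUP,      /* rule 1: replace the route and all siblings */
    REPLACE_FIRST_NON_ECMP,  /* rules 2 and 3: replace first non-ECMP-able */
    ADD_NEW                  /* rules 2 and 3: nothing suitable, just add  */
};

enum replace_action decide_replace(bool new_is_ecmp_able,
                                   bool have_matching_ecmp,
                                   bool have_matching_non_ecmp)
{
    /* Rule 1: ECMP-able route meets an existing ECMP-able match. */
    if (new_is_ecmp_able && have_matching_ecmp)
        return REPLACE_ECMP_GROUP;
    /* Rules 2 and 3: otherwise only a non-ECMP-able match is replaced. */
    if (have_matching_non_ecmp)
        return REPLACE_FIRST_NON_ECMP;
    return ADD_NEW;
}

/* Human-readable description of each action. */
const char *action_name(enum replace_action a)
{
    switch (a) {
    case REPLACE_ECMP_GROUP:     return "replace ECMP route and all its siblings";
    case REPLACE_FIRST_NON_ECMP: return "replace first matching non-ECMP-able route";
    case ADD_NEW:                return "add the new route";
    }
    assert(!"unknown action");
    return "";
}
```

Note that a new non-ECMP-able route never replaces an existing ECMP-able one in this model, which follows from rule 3.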
    • ipv6: do not delete previously existing ECMP routes if add fails · 35f1b4e9
      Michal Kubeček committed
      If adding a nexthop of an IPv6 multipath route fails, the comment in
      ip6_route_multipath() says we are going to delete all nexthops already
      added. However, the current implementation also deletes routes it
      has not even tried to add yet. For example, running
      
        ip route add 1234:5678::/64 \
            nexthop via fe80::aa dev dummy1 \
            nexthop via fe80::bb dev dummy1 \
            nexthop via fe80::cc dev dummy1
      
      twice results in removing all routes the first command added.
      
      Limit the second (delete) run to nexthops that succeeded in the first
      (add) run.
      
      Fixes: 51ebd318 ("ipv6: add support of equal cost multipath (ECMP)")
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
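The fix described here, limiting the rollback to the nexthops that the failed call itself added, can be sketched in userspace. The table and helper names below are hypothetical, and "already exists" is the only failure modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ROUTES 16

/* Hypothetical route table keyed by gateway string. */
struct table { char gw[MAX_ROUTES][40]; int count; };

int table_find(const struct table *t, const char *gw)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->gw[i], gw) == 0)
            return i;
    return -1;
}

/* Returns 0 on success, -1 if the entry exists or does not fit. */
int table_add(struct table *t, const char *gw)
{
    size_t len = strlen(gw);

    if (table_find(t, gw) >= 0 || t->count == MAX_ROUTES ||
        len >= sizeof(t->gw[0]))
        return -1;
    memcpy(t->gw[t->count++], gw, len + 1);
    return 0;
}

void table_del(struct table *t, const char *gw)
{
    int i = table_find(t, gw);

    if (i < 0)
        return;
    /* Move the last entry into the freed slot. */
    memmove(t->gw[i], t->gw[t->count - 1], sizeof(t->gw[0]));
    t->count--;
}

/* Add all nexthops; on failure, undo only what this call added. */
int multipath_add(struct table *t, const char *const *gws, int n)
{
    int added = 0;

    assert(t != NULL && gws != NULL);
    while (added < n && table_add(t, gws[added]) == 0)
        added++;
    if (added == n)
        return 0;
    for (int i = 0; i < added; i++)   /* not i < n: that was the bug */
        table_del(t, gws[i]);
    return -1;
}
```

Running multipath_add() twice with the same three gateways fails the second time on the first nexthop, so added is 0 and nothing installed by the first call is removed.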
  15. 10 May, 2015 1 commit
    • ipv6: Fixed source specific default route handling. · e16e888b
      Markus Stenberg committed
      If there are only IPv6 source specific default routes present, the
      host gets -ENETUNREACH on e.g. connect() because ip6_dst_lookup_tail
      calls ip6_route_output first, and given source address any, it fails,
      and ip6_route_get_saddr is never called.
      
      The change is to call ip6_route_get_saddr even if the initial
      ip6_route_output fails, and then to call ip6_route_output _again_ once
      an appropriate source address is available.
      
      Note that this is a '99% fix' to the problem; a correct fix would be to
      do route lookups only within addrconf.c when picking a source address,
      and never call ip6_route_output before source address has been
      populated.
      Signed-off-by: Markus Stenberg <markus.stenberg@iki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 04 May, 2015 1 commit
  17. 02 May, 2015 2 commits
    • ipv6: Remove DST_METRICS_FORCE_OVERWRITE and _rt6i_peer · afc4eef8
      Martin KaFai Lau committed
      _rt6i_peer is no longer needed after the last patch,
      'ipv6: Stop rt6_info from using inet_peer's metrics'.
      
      DST_METRICS_FORCE_OVERWRITE is added by
      commit e5fd387a ("ipv6: do not overwrite inetpeer metrics prematurely").
      Since inetpeer is no longer used for metrics, this bit is also not needed.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Michal Kubeček <mkubecek@suse.cz>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Stop rt6_info from using inet_peer's metrics · 4b32b5ad
      Martin KaFai Lau committed
      inet_peer is indexed by the dst address alone.  However, the fib6 tree
      could have multiple routing entries (rt6_info) for the same dst. For
      example,
      1. A /128 dst via multiple gateways.
      2. A RTF_CACHE route cloned from a /128 route.
      
      In the above cases, all of them will share the same metrics and
      step on each other.
      
      This patch will steer away from inet_peer's metrics and use
      dst_cow_metrics_generic() for everything.
      
      Change Highlights:
      1. Remove rt6_cow_metrics() which currently acquires metrics from
         inet_peer for DST_HOST route (i.e. /128 route).
      2. Add rt6i_pmtu to take care of the pmtu update to avoid creating a
         full size metrics just to override the RTAX_MTU.
      3. After (2), the RTF_CACHE route can also share the metrics with its
         dst.from route, by:
         dst_init_metrics(&cache_rt->dst, dst_metrics_ptr(cache_rt->dst.from), true);
      4. Stop creating RTF_CACHE route by cloning another RTF_CACHE route.  Instead,
         directly clone from rt->dst.
      
         [ Currently, cloning from another RTF_CACHE is only possible during
           rt6_do_redirect().  Also, the old clone is removed from the tree
           immediately after the new clone is added. ]
      
         In case of cloning from an older redirect RTF_CACHE, it should work as
         before.
      
         In case of cloning from an older pmtu RTF_CACHE, this patch will forget
         the pmtu and re-learn it (if there is any) from the redirected route.
      
      The _rt6i_peer and DST_METRICS_FORCE_OVERWRITE will be removed
      in the next cleanup patch.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
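Change highlights (2) and (3) can be pictured as a per-route PMTU field layered over a shared, read-only metrics block. The sketch below is a simplified model under that assumption; the toy_* types and functions are hypothetical and only loosely inspired by rt6i_pmtu and the shared dst metrics.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared metrics block (stands in for the dst metrics). */
struct toy_metrics { unsigned int mtu; };

/* Hypothetical route: shares metrics, keeps its own learned PMTU. */
struct toy_route {
    const struct toy_metrics *metrics;  /* shared, never written via a clone */
    unsigned int pmtu;                  /* 0 means "no PMTU learned"         */
};

/* Effective MTU: a learned PMTU overrides the shared metric. */
unsigned int toy_route_mtu(const struct toy_route *rt)
{
    assert(rt != NULL && rt->metrics != NULL);
    return rt->pmtu ? rt->pmtu : rt->metrics->mtu;
}

/* Clone a cache route: share the parent's metrics, start with no PMTU. */
struct toy_route toy_clone_cache(const struct toy_route *parent)
{
    struct toy_route clone = { parent->metrics, 0 };
    return clone;
}
```

A PMTU update then writes only the clone's pmtu field, so the parent route and any other clone keep seeing the original MTU without a full private copy of the metrics.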