1. 01 9月, 2013 1 次提交
  2. 23 8月, 2013 1 次提交
  3. 02 8月, 2013 2 次提交
    • M
      ipv6: update ip6_rt_last_gc every time GC is run · 49a18d86
      Michal Kubeček 提交于
      As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
      hold the last time garbage collector was run so that we should
      update it whenever fib6_run_gc() calls fib6_clean_all(), not only
      if we got there from ip6_dst_gc().
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      49a18d86
    • M
      ipv6: prevent fib6_run_gc() contention · 2ac3ac8f
      Michal Kubeček 提交于
      On a high-traffic router with many processors and many IPv6 dst
      entries, soft lockup in fib6_run_gc() can occur when number of
      entries reaches gc_thresh.
      
      This happens because fib6_run_gc() uses fib6_gc_lock to allow
      only one thread to run the garbage collector but ip6_dst_gc()
      doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
      returns. On a system with many entries, this can take some time
      so that in the meantime, other threads pass the tests in
      ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
      the lock. They then have to run the garbage collector one after
      another which blocks them for quite long.
      
      Resolve this by replacing special value ~0UL of expire parameter
      to fib6_run_gc() by explicit "force" parameter to choose between
      spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
      force=false if gc_thresh is reached but not max_size.
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2ac3ac8f
  4. 01 8月, 2013 1 次提交
  5. 25 7月, 2013 1 次提交
  6. 12 7月, 2013 1 次提交
    • H
      ipv6: fix route selection if kernel is not compiled with CONFIG_IPV6_ROUTER_PREF · afc154e9
      Hannes Frederic Sowa 提交于
      This is a follow-up patch to 3630d400
      ("ipv6: rt6_check_neigh should successfully verify neigh if no NUD
      information are available").
      
      Since the removal of rt->n in rt6_info we can end up with a dst ==
      NULL in rt6_check_neigh. In case the kernel is not compiled with
      CONFIG_IPV6_ROUTER_PREF we should also select a route with unkown
      NUD state but we must not avoid doing round robin selection on routes
      with the same target. So introduce and pass down a boolean ``do_rr'' to
      indicate when we should update rt->rr_ptr. As soon as no route is valid
      we do backtracking and do a lookup on a higher level in the fib trie.
      
      v2:
      a) Improved rt6_check_neigh logic (no need to create neighbour there)
         and documented return values.
      
      v3:
      a) Introduce enum rt6_nud_state to get rid of the magic numbers
         (thanks to David Miller).
      b) Update and shorten commit message a bit to actualy reflect
         the source.
      Reported-by: NPierre Emeriaud <petrus.lt@gmail.com>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      afc154e9
  7. 11 7月, 2013 1 次提交
    • H
      ipv6: in case of link failure remove route directly instead of letting it expire · 1eb4f758
      Hannes Frederic Sowa 提交于
      We could end up expiring a route which is part of an ecmp route set. Doing
      so would invalidate the rt->rt6i_nsiblings calculations and could provoke
      the following panic:
      
      [   80.144667] ------------[ cut here ]------------
      [   80.145172] kernel BUG at net/ipv6/ip6_fib.c:733!
      [   80.145172] invalid opcode: 0000 [#1] SMP
      [   80.145172] Modules linked in: 8021q nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables
      +snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer virtio_balloon snd soundcore i2c_piix4 i2c_core virtio_net virtio_blk
      [   80.145172] CPU: 1 PID: 786 Comm: ping6 Not tainted 3.10.0+ #118
      [   80.145172] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [   80.145172] task: ffff880117fa0000 ti: ffff880118770000 task.ti: ffff880118770000
      [   80.145172] RIP: 0010:[<ffffffff815f3b5d>]  [<ffffffff815f3b5d>] fib6_add+0x75d/0x830
      [   80.145172] RSP: 0018:ffff880118771798  EFLAGS: 00010202
      [   80.145172] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011350e480
      [   80.145172] RDX: ffff88011350e238 RSI: 0000000000000004 RDI: ffff88011350f738
      [   80.145172] RBP: ffff880118771848 R08: ffff880117903280 R09: 0000000000000001
      [   80.145172] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011350f680
      [   80.145172] R13: ffff880117903280 R14: ffff880118771890 R15: ffff88011350ef90
      [   80.145172] FS:  00007f02b5127740(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
      [   80.145172] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   80.145172] CR2: 00007f981322a000 CR3: 00000001181b1000 CR4: 00000000000006e0
      [   80.145172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   80.145172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [   80.145172] Stack:
      [   80.145172]  0000000000000001 ffff880100000000 ffff880100000000 ffff880117903280
      [   80.145172]  0000000000000000 ffff880119a4cf00 0000000000000400 00000000000007fa
      [   80.145172]  0000000000000000 0000000000000000 0000000000000000 ffff88011350f680
      [   80.145172] Call Trace:
      [   80.145172]  [<ffffffff815eeceb>] ? rt6_bind_peer+0x4b/0x90
      [   80.145172]  [<ffffffff815ed985>] __ip6_ins_rt+0x45/0x70
      [   80.145172]  [<ffffffff815eee35>] ip6_ins_rt+0x35/0x40
      [   80.145172]  [<ffffffff815ef1e4>] ip6_pol_route.isra.44+0x3a4/0x4b0
      [   80.145172]  [<ffffffff815ef34a>] ip6_pol_route_output+0x2a/0x30
      [   80.145172]  [<ffffffff81616077>] fib6_rule_action+0xd7/0x210
      [   80.145172]  [<ffffffff815ef320>] ? ip6_pol_route_input+0x30/0x30
      [   80.145172]  [<ffffffff81553026>] fib_rules_lookup+0xc6/0x140
      [   80.145172]  [<ffffffff81616374>] fib6_rule_lookup+0x44/0x80
      [   80.145172]  [<ffffffff815ef320>] ? ip6_pol_route_input+0x30/0x30
      [   80.145172]  [<ffffffff815edea3>] ip6_route_output+0x73/0xb0
      [   80.145172]  [<ffffffff815dfdf3>] ip6_dst_lookup_tail+0x2c3/0x2e0
      [   80.145172]  [<ffffffff813007b1>] ? list_del+0x11/0x40
      [   80.145172]  [<ffffffff81082a4c>] ? remove_wait_queue+0x3c/0x50
      [   80.145172]  [<ffffffff815dfe4d>] ip6_dst_lookup_flow+0x3d/0xa0
      [   80.145172]  [<ffffffff815fda77>] rawv6_sendmsg+0x267/0xc20
      [   80.145172]  [<ffffffff815a8a83>] inet_sendmsg+0x63/0xb0
      [   80.145172]  [<ffffffff8128eb93>] ? selinux_socket_sendmsg+0x23/0x30
      [   80.145172]  [<ffffffff815218d6>] sock_sendmsg+0xa6/0xd0
      [   80.145172]  [<ffffffff81524a68>] SYSC_sendto+0x128/0x180
      [   80.145172]  [<ffffffff8109825c>] ? update_curr+0xec/0x170
      [   80.145172]  [<ffffffff81041d09>] ? kvm_clock_get_cycles+0x9/0x10
      [   80.145172]  [<ffffffff810afd1e>] ? __getnstimeofday+0x3e/0xd0
      [   80.145172]  [<ffffffff8152509e>] SyS_sendto+0xe/0x10
      [   80.145172]  [<ffffffff8164efd9>] system_call_fastpath+0x16/0x1b
      [   80.145172] Code: fe ff ff 41 f6 45 2a 06 0f 85 ca fe ff ff 49 8b 7e 08 4c 89 ee e8 94 ef ff ff e9 b9 fe ff ff 48 8b 82 28 05 00 00 e9 01 ff ff ff <0f> 0b 49 8b 54 24 30 0d 00 00 40 00 89 83 14 01 00 00 48 89 53
      [   80.145172] RIP  [<ffffffff815f3b5d>] fib6_add+0x75d/0x830
      [   80.145172]  RSP <ffff880118771798>
      [   80.387413] ---[ end trace 02f20b7a8b81ed95 ]---
      [   80.390154] Kernel panic - not syncing: Fatal exception in interrupt
      
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1eb4f758
  8. 04 7月, 2013 1 次提交
    • H
      ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available · 3630d400
      Hannes Frederic Sowa 提交于
      After the removal of rt->n we do not create a neighbour entry at route
      insertion time (rt6_bind_neighbour is gone). As long as no neighbour is
      created because of "useful traffic" we skip this routing entry because
      rt6_check_neigh cannot pick up a valid neighbour (neigh == NULL) and
      thus returns false.
      
      This change was introduced by commit
      887c95cc ("ipv6: Complete neighbour
      entry removal from dst_entry.")
      
      To quote RFC4191:
      "If the host has no information about the router's reachability, then
      the host assumes the router is reachable."
      
      and also:
      "A host MUST NOT probe a router's reachability in the absence of useful
      traffic that the host would have sent to the router if it were reachable."
      
      So, just assume the router is reachable and let's rt6_probe do the
      rest. We don't need to create a neighbour on route insertion time.
      
      If we don't compile with CONFIG_IPV6_ROUTER_PREF (RFC4191 support)
      a neighbour is only valid if its nud_state is NUD_VALID. I did not find
      any references that we should probe the router on route insertion time
      via the other RFCs. So skip this route in that case.
      
      v2:
      a) use IS_ENABLED instead of #ifdefs (thanks to Sergei Shtylyov)
      Reported-by: NPierre Emeriaud <petrus.lt@gmail.com>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3630d400
  9. 02 7月, 2013 1 次提交
  10. 13 6月, 2013 1 次提交
  11. 29 5月, 2013 2 次提交
  12. 22 3月, 2013 1 次提交
  13. 05 3月, 2013 1 次提交
  14. 21 2月, 2013 1 次提交
  15. 19 2月, 2013 2 次提交
  16. 31 1月, 2013 1 次提交
  17. 22 1月, 2013 1 次提交
  18. 19 1月, 2013 1 次提交
  19. 18 1月, 2013 7 次提交
  20. 15 1月, 2013 1 次提交
  21. 14 1月, 2013 2 次提交
  22. 07 1月, 2013 1 次提交
  23. 04 12月, 2012 1 次提交
  24. 19 11月, 2012 4 次提交
    • E
      net: Enable a userns root rtnl calls that are safe for unprivilged users · b51642f6
      Eric W. Biederman 提交于
      - Only allow moving network devices to network namespaces you have
        CAP_NET_ADMIN privileges over.
      
      - Enable creating/deleting/modifying interfaces
      - Enable adding/deleting addresses
      - Enable adding/setting/deleting neighbour entries
      - Enable adding/removing routes
      - Enable adding/removing fib rules
      - Enable setting the forwarding state
      - Enable adding/removing ipv6 address labels
      - Enable setting bridge parameter
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b51642f6
    • E
      net: Allow userns root to control ipv6 · af31f412
      Eric W. Biederman 提交于
      Allow an unpriviled user who has created a user namespace, and then
      created a network namespace to effectively use the new network
      namespace, by reducing capable(CAP_NET_ADMIN) and
      capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
      CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
      
      Settings that merely control a single network device are allowed.
      Either the network device is a logical network device where
      restrictions make no difference or the network device is hardware NIC
      that has been explicity moved from the initial network namespace.
      
      In general policy and network stack state changes are allowed while
      resource control is left unchanged.
      
      Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
      Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
      Allow the SIOCADDRT ioctl to add ipv6 routes.
      Allow the SIOCDELRT ioctl to delete ipv6 routes.
      
      Allow creation of ipv6 raw sockets.
      
      Allow setting the IPV6_JOIN_ANYCAST socket option.
      Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
      socket option.
      
      Allow setting the IPV6_TRANSPARENT socket option.
      Allow setting the IPV6_HOPOPTS socket option.
      Allow setting the IPV6_RTHDRDSTOPTS socket option.
      Allow setting the IPV6_DSTOPTS socket option.
      Allow setting the IPV6_IPSEC_POLICY socket option.
      Allow setting the IPV6_XFRM_POLICY socket option.
      
      Allow sending packets with the IPV6_2292HOPOPTS control message.
      Allow sending packets with the IPV6_2292DSTOPTS control message.
      Allow sending packets with the IPV6_RTHDRDSTOPTS control message.
      
      Allow setting the multicast routing socket options on non multicast
      routing sockets.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
      setting up, changing and deleting tunnels over ipv6.
      
      Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
      setting up, changing and deleting ipv6 over ipv4 tunnels.
      
      Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
      deleting, and changing the potential router list for ISATAP tunnels.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af31f412
    • E
      net: Push capable(CAP_NET_ADMIN) into the rtnl methods · dfc47ef8
      Eric W. Biederman 提交于
      - In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check
        to ns_capable(net->user-ns, CAP_NET_ADMIN).  Allowing unprivileged
        users to make netlink calls to modify their local network
        namespace.
      
      - In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so
        that calls that are not safe for unprivileged users are still
        protected.
      
      Later patches will remove the extra capable calls from methods
      that are safe for unprivilged users.
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dfc47ef8
    • E
      net: Don't export sysctls to unprivileged users · 464dc801
      Eric W. Biederman 提交于
      In preparation for supporting the creation of network namespaces
      by unprivileged users, modify all of the per net sysctl exports
      and refuse to allow them to unprivileged users.
      
      This makes it safe for unprivileged users in general to access
      per net sysctls, and allows sysctls to be exported to unprivileged
      users on an individual basis as they are deemed safe.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      464dc801
  25. 15 11月, 2012 1 次提交
  26. 09 11月, 2012 1 次提交
  27. 04 11月, 2012 1 次提交