1. 28 3月, 2018 1 次提交
  2. 27 3月, 2018 5 次提交
  3. 23 3月, 2018 4 次提交
    • C
      gre: fix TUNNEL_SEQ bit check on sequence numbering · 15746394
      Colin Ian King 提交于
      The current logic of flags | TUNNEL_SEQ is always non-zero and hence
      sequence numbers are always incremented no matter the setting of the
      TUNNEL_SEQ bit.  Fix this by using & instead of |.
      
      Detected by CoverityScan, CID#1466039 ("Operands don't affect result")
      
      Fixes: 77a5196a ("gre: add sequence number for collect md mode.")
      Signed-off-by: NColin Ian King <colin.king@canonical.com>
      Acked-by: NWilliam Tu <u9012063@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      15746394
    • D
      net/ipv6: Handle onlink flag with multipath routes · 68e2ffde
      David Ahern 提交于
      For multipath routes the ONLINK flag can be specified per nexthop in
      rtnh_flags or globally in rtm_flags. Update ip6_route_multipath_add
      to consider the ONLINK setting coming from rtnh_flags. Each loop over
      nexthops the config for the sibling route is initialized to the global
      config and then per nexthop settings overlayed. The flag is 'or'ed into
      fib6_config to handle the ONLINK flag coming from either rtm_flags or
      rtnh_flags.
      
      Fixes: fc1e64e1 ("net/ipv6: Add support for onlink flag")
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      68e2ffde
    • D
      ipv6: sr: fix NULL pointer dereference when setting encap source address · 8936ef76
      David Lebrun 提交于
      When using seg6 in encap mode, we call ipv6_dev_get_saddr() to set the
      source address of the outer IPv6 header, in case none was specified.
      Using skb->dev can lead to BUG() when it is in an inconsistent state.
      This patch uses the net_device attached to the skb's dst instead.
      
      [940807.667429] BUG: unable to handle kernel NULL pointer dereference at 000000000000047c
      [940807.762427] IP: ipv6_dev_get_saddr+0x8b/0x1d0
      [940807.815725] PGD 0 P4D 0
      [940807.847173] Oops: 0000 [#1] SMP PTI
      [940807.890073] Modules linked in:
      [940807.927765] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G        W        4.16.0-rc1-seg6bpf+ #2
      [940808.028988] Hardware name: HP ProLiant DL120 G6/ProLiant DL120 G6, BIOS O26    09/06/2010
      [940808.128128] RIP: 0010:ipv6_dev_get_saddr+0x8b/0x1d0
      [940808.187667] RSP: 0018:ffff88043fd836b0 EFLAGS: 00010206
      [940808.251366] RAX: 0000000000000005 RBX: ffff88042cb1c860 RCX: 00000000000000fe
      [940808.338025] RDX: 00000000000002c0 RSI: ffff88042cb1c860 RDI: 0000000000004500
      [940808.424683] RBP: ffff88043fd83740 R08: 0000000000000000 R09: ffffffffffffffff
      [940808.511342] R10: 0000000000000040 R11: 0000000000000000 R12: ffff88042cb1c850
      [940808.598012] R13: ffffffff8208e380 R14: ffff88042ac8da00 R15: 0000000000000002
      [940808.684675] FS:  0000000000000000(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000
      [940808.783036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [940808.852975] CR2: 000000000000047c CR3: 00000004255fe000 CR4: 00000000000006e0
      [940808.939634] Call Trace:
      [940808.970041]  <IRQ>
      [940808.995250]  ? ip6t_do_table+0x265/0x640
      [940809.043341]  seg6_do_srh_encap+0x28f/0x300
      [940809.093516]  ? seg6_do_srh+0x1a0/0x210
      [940809.139528]  seg6_do_srh+0x1a0/0x210
      [940809.183462]  seg6_output+0x28/0x1e0
      [940809.226358]  lwtunnel_output+0x3f/0x70
      [940809.272370]  ip6_xmit+0x2b8/0x530
      [940809.313185]  ? ac6_proc_exit+0x20/0x20
      [940809.359197]  inet6_csk_xmit+0x7d/0xc0
      [940809.404173]  tcp_transmit_skb+0x548/0x9a0
      [940809.453304]  __tcp_retransmit_skb+0x1a8/0x7a0
      [940809.506603]  ? ip6_default_advmss+0x40/0x40
      [940809.557824]  ? tcp_current_mss+0x24/0x90
      [940809.605925]  tcp_retransmit_skb+0xd/0x80
      [940809.654016]  tcp_xmit_retransmit_queue.part.17+0xf9/0x210
      [940809.719797]  tcp_ack+0xa47/0x1110
      [940809.760612]  tcp_rcv_established+0x13c/0x570
      [940809.812865]  tcp_v6_do_rcv+0x151/0x3d0
      [940809.858879]  tcp_v6_rcv+0xa5c/0xb10
      [940809.901770]  ? seg6_output+0xdd/0x1e0
      [940809.946745]  ip6_input_finish+0xbb/0x460
      [940809.994837]  ip6_input+0x74/0x80
      [940810.034612]  ? ip6_rcv_finish+0xb0/0xb0
      [940810.081663]  ipv6_rcv+0x31c/0x4c0
      ...
      
      Fixes: 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
      Reported-by: NTom Herbert <tom@quantonium.net>
      Signed-off-by: NDavid Lebrun <dlebrun@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8936ef76
    • D
      ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state · 191f86ca
      David Lebrun 提交于
      The seg6_build_state() function is called with RCU read lock held,
      so we cannot use GFP_KERNEL. This patch uses GFP_ATOMIC instead.
      
      [   92.770271] =============================
      [   92.770628] WARNING: suspicious RCU usage
      [   92.770921] 4.16.0-rc4+ #12 Not tainted
      [   92.771277] -----------------------------
      [   92.771585] ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section!
      [   92.772279]
      [   92.772279] other info that might help us debug this:
      [   92.772279]
      [   92.773067]
      [   92.773067] rcu_scheduler_active = 2, debug_locks = 1
      [   92.773514] 2 locks held by ip/2413:
      [   92.773765]  #0:  (rtnl_mutex){+.+.}, at: [<00000000e5461720>] rtnetlink_rcv_msg+0x441/0x4d0
      [   92.774377]  #1:  (rcu_read_lock){....}, at: [<00000000df4f161e>] lwtunnel_build_state+0x59/0x210
      [   92.775065]
      [   92.775065] stack backtrace:
      [   92.775371] CPU: 0 PID: 2413 Comm: ip Not tainted 4.16.0-rc4+ #12
      [   92.775791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
      [   92.776608] Call Trace:
      [   92.776852]  dump_stack+0x7d/0xbc
      [   92.777130]  __schedule+0x133/0xf00
      [   92.777393]  ? unwind_get_return_address_ptr+0x50/0x50
      [   92.777783]  ? __sched_text_start+0x8/0x8
      [   92.778073]  ? rcu_is_watching+0x19/0x30
      [   92.778383]  ? kernel_text_address+0x49/0x60
      [   92.778800]  ? __kernel_text_address+0x9/0x30
      [   92.779241]  ? unwind_get_return_address+0x29/0x40
      [   92.779727]  ? pcpu_alloc+0x102/0x8f0
      [   92.780101]  _cond_resched+0x23/0x50
      [   92.780459]  __mutex_lock+0xbd/0xad0
      [   92.780818]  ? pcpu_alloc+0x102/0x8f0
      [   92.781194]  ? seg6_build_state+0x11d/0x240
      [   92.781611]  ? save_stack+0x9b/0xb0
      [   92.781965]  ? __ww_mutex_wakeup_for_backoff+0xf0/0xf0
      [   92.782480]  ? seg6_build_state+0x11d/0x240
      [   92.782925]  ? lwtunnel_build_state+0x1bd/0x210
      [   92.783393]  ? ip6_route_info_create+0x687/0x1640
      [   92.783846]  ? ip6_route_add+0x74/0x110
      [   92.784236]  ? inet6_rtm_newroute+0x8a/0xd0
      
      Fixes: 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
      Signed-off-by: NDavid Lebrun <dlebrun@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      191f86ca
  4. 22 3月, 2018 1 次提交
  5. 21 3月, 2018 1 次提交
  6. 17 3月, 2018 1 次提交
  7. 16 3月, 2018 3 次提交
    • D
      net/ipv6: Add l3mdev check to ipv6_chk_addr_and_flags · 1893ff20
      David Ahern 提交于
      Lookup the L3 master device for the passed in device. Only consider
      addresses on netdev's with the same master device. If the device is
      not enslaved or is NULL, then the l3mdev is NULL which means only
      devices not enslaved (ie, in the default domain) are considered.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1893ff20
    • D
      net/ipv6: Change address check to always take a device argument · 232378e8
      David Ahern 提交于
      ipv6_chk_addr_and_flags determines if an address is a local address and
      optionally if it is an address on a specific device. For example, it is
      called by ip6_route_info_create to determine if a given gateway address
      is a local address. The address check currently does not consider L3
      domains and as a result does not allow a route to be added in one VRF
      if the nexthop points to an address in a second VRF. e.g.,
      
          $ ip route add 2001:db8:1::/64 vrf r2 via 2001:db8:102::23
          Error: Invalid gateway address.
      
      where 2001:db8:102::23 is an address on an interface in vrf r1.
      
      ipv6_chk_addr_and_flags needs to allow callers to always pass in a device
      with a separate argument to not limit the address to the specific device.
      The device is used used to determine the L3 domain of interest.
      
      To that end add an argument to skip the device check and update callers
      to always pass a device where possible and use the new argument to mean
      any address in the domain.
      
      Update a handful of users of ipv6_chk_addr with a NULL dev argument. This
      patch handles the change to these callers without adding the domain check.
      
      ip6_validate_gw needs to handle 2 cases - one where the device is given
      as part of the nexthop spec and the other where the device is resolved.
      There is at least 1 VRF case where deferring the check to only after
      the route lookup has resolved the device fails with an unintuitive error
      "RTNETLINK answers: No route to host" as opposed to the preferred
      "Error: Gateway can not be a local address." The 'no route to host'
      error is because of the fallback to a full lookup. The check is done
      twice to avoid this error.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      232378e8
    • D
      net/ipv6: Refactor gateway validation on route add · 9fbb704c
      David Ahern 提交于
      Move gateway validation code from ip6_route_info_create into
      ip6_validate_gw. Code move plus adjustments to handle the potential
      reset of dev and idev and to make checkpatch happy.
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9fbb704c
  8. 13 3月, 2018 1 次提交
    • P
      net: ipv6: keep sk status consistent after datagram connect failure · 2f987a76
      Paolo Abeni 提交于
      On unsuccesful ip6_datagram_connect(), if the failure is caused by
      ip6_datagram_dst_update(), the sk peer information are cleared, but
      the sk->sk_state is preserved.
      
      If the socket was already in an established status, the overall sk
      status is inconsistent and fouls later checks in datagram code.
      
      Fix this saving the old peer information and restoring them in
      case of failure. This also aligns ipv6 datagram connect() behavior
      with ipv4.
      
      v1 -> v2:
       - added missing Fixes tag
      
      Fixes: 85cb73ff ("net: ipv6: reset daddr and dport in sk if connect() fails")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2f987a76
  9. 12 3月, 2018 1 次提交
  10. 10 3月, 2018 5 次提交
    • W
      ip6erspan: make sure enough headroom at xmit. · e41c7c68
      William Tu 提交于
      The patch adds skb_cow_header() to ensure enough headroom
      at ip6erspan_tunnel_xmit before pushing the erspan header
      to the skb.
      Signed-off-by: NWilliam Tu <u9012063@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e41c7c68
    • W
      ip6erspan: improve error handling for erspan version number. · d6aa7119
      William Tu 提交于
      When users fill in incorrect erspan version number through
      the struct erspan_metadata uapi, current code skips pushing
      the erspan header but continue pushing the gre header, which
      is incorrect.  The patch fixes it by returning error.
      Signed-off-by: NWilliam Tu <u9012063@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6aa7119
    • W
      ip6gre: add erspan v2 to tunnel lookup · 3b04caab
      William Tu 提交于
      The patch adds the erspan v2 proto in ip6gre_tunnel_lookup
      so the erspan v2 tunnel can be found correctly.
      Signed-off-by: NWilliam Tu <u9012063@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b04caab
    • E
      net: do not create fallback tunnels for non-default namespaces · 79134e6c
      Eric Dumazet 提交于
      fallback tunnels (like tunl0, gre0, gretap0, erspan0, sit0,
      ip6tnl0, ip6gre0) are automatically created when the corresponding
      module is loaded.
      
      These tunnels are also automatically created when a new network
      namespace is created, at a great cost.
      
      In many cases, netns are used for isolation purposes, and these
      extra network devices are a waste of resources. We are using
      thousands of netns per host, and hit the netns creation/delete
      bottleneck a lot. (Many thanks to Kirill for recent work on this)
      
      Add a new sysctl so that we can opt-out from this automatic creation.
      
      Note that these tunnels are still created for the initial namespace,
      to be the least intrusive for typical setups.
      
      Tested:
      lpk43:~# cat add_del_unshare.sh
      for i in `seq 1 40`
      do
       (for j in `seq 1 100` ; do  unshare -n /bin/true >/dev/null ; done) &
      done
      wait
      
      lpk43:~# echo 0 >/proc/sys/net/core/fb_tunnels_only_for_init_net
      lpk43:~# time ./add_del_unshare.sh
      
      real	0m37.521s
      user	0m0.886s
      sys	7m7.084s
      lpk43:~# echo 1 >/proc/sys/net/core/fb_tunnels_only_for_init_net
      lpk43:~# time ./add_del_unshare.sh
      
      real	0m4.761s
      user	0m0.851s
      sys	1m8.343s
      lpk43:~#
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79134e6c
    • L
      ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option() · 9f62c15f
      Lorenzo Bianconi 提交于
      Fix the following slab-out-of-bounds kasan report in
      ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not
      linear and the accessed data are not in the linear data region of orig_skb.
      
      [ 1503.122508] ==================================================================
      [ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990
      [ 1503.123036] Read of size 1184 at addr ffff8800298ab6b0 by task netperf/1932
      
      [ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124
      [ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
      [ 1503.123527] Call Trace:
      [ 1503.123579]  <IRQ>
      [ 1503.123638]  print_address_description+0x6e/0x280
      [ 1503.123849]  kasan_report+0x233/0x350
      [ 1503.123946]  memcpy+0x1f/0x50
      [ 1503.124037]  ndisc_send_redirect+0x94e/0x990
      [ 1503.125150]  ip6_forward+0x1242/0x13b0
      [...]
      [ 1503.153890] Allocated by task 1932:
      [ 1503.153982]  kasan_kmalloc+0x9f/0xd0
      [ 1503.154074]  __kmalloc_track_caller+0xb5/0x160
      [ 1503.154198]  __kmalloc_reserve.isra.41+0x24/0x70
      [ 1503.154324]  __alloc_skb+0x130/0x3e0
      [ 1503.154415]  sctp_packet_transmit+0x21a/0x1810
      [ 1503.154533]  sctp_outq_flush+0xc14/0x1db0
      [ 1503.154624]  sctp_do_sm+0x34e/0x2740
      [ 1503.154715]  sctp_primitive_SEND+0x57/0x70
      [ 1503.154807]  sctp_sendmsg+0xaa6/0x1b10
      [ 1503.154897]  sock_sendmsg+0x68/0x80
      [ 1503.154987]  ___sys_sendmsg+0x431/0x4b0
      [ 1503.155078]  __sys_sendmsg+0xa4/0x130
      [ 1503.155168]  do_syscall_64+0x171/0x3f0
      [ 1503.155259]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [ 1503.155436] Freed by task 1932:
      [ 1503.155527]  __kasan_slab_free+0x134/0x180
      [ 1503.155618]  kfree+0xbc/0x180
      [ 1503.155709]  skb_release_data+0x27f/0x2c0
      [ 1503.155800]  consume_skb+0x94/0xe0
      [ 1503.155889]  sctp_chunk_put+0x1aa/0x1f0
      [ 1503.155979]  sctp_inq_pop+0x2f8/0x6e0
      [ 1503.156070]  sctp_assoc_bh_rcv+0x6a/0x230
      [ 1503.156164]  sctp_inq_push+0x117/0x150
      [ 1503.156255]  sctp_backlog_rcv+0xdf/0x4a0
      [ 1503.156346]  __release_sock+0x142/0x250
      [ 1503.156436]  release_sock+0x80/0x180
      [ 1503.156526]  sctp_sendmsg+0xbb0/0x1b10
      [ 1503.156617]  sock_sendmsg+0x68/0x80
      [ 1503.156708]  ___sys_sendmsg+0x431/0x4b0
      [ 1503.156799]  __sys_sendmsg+0xa4/0x130
      [ 1503.156889]  do_syscall_64+0x171/0x3f0
      [ 1503.156980]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [ 1503.157158] The buggy address belongs to the object at ffff8800298ab600
                      which belongs to the cache kmalloc-1024 of size 1024
      [ 1503.157444] The buggy address is located 176 bytes inside of
                      1024-byte region [ffff8800298ab600, ffff8800298aba00)
      [ 1503.157702] The buggy address belongs to the page:
      [ 1503.157820] page:ffffea0000a62a00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0
      [ 1503.158053] flags: 0x4000000000008100(slab|head)
      [ 1503.158171] raw: 4000000000008100 0000000000000000 0000000000000000 00000001800e000e
      [ 1503.158350] raw: dead000000000100 dead000000000200 ffff880036002600 0000000000000000
      [ 1503.158523] page dumped because: kasan: bad access detected
      
      [ 1503.158698] Memory state around the buggy address:
      [ 1503.158816]  ffff8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 1503.158988]  ffff8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 1503.159165] >ffff8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1503.159338]                    ^
      [ 1503.159436]  ffff8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1503.159610]  ffff8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1503.159785] ==================================================================
      [ 1503.159964] Disabling lock debugging due to kernel taint
      
      The test scenario to trigger the issue consists of 4 devices:
      - H0: data sender, connected to LAN0
      - H1: data receiver, connected to LAN1
      - GW0 and GW1: routers between LAN0 and LAN1. Both of them have an
        ethernet connection on LAN0 and LAN1
      On H{0,1} set GW0 as default gateway while on GW0 set GW1 as next hop for
      data from LAN0 to LAN1.
      Moreover create an ip6ip6 tunnel between H0 and H1 and send 3 concurrent
      data streams (TCP/UDP/SCTP) from H0 to H1 through ip6ip6 tunnel (send
      buffer size is set to 16K). While data streams are active flush the route
      cache on HA multiple times.
      I have not been able to identify a given commit that introduced the issue
      since, using the reproducer described above, the kasan report has been
      triggered from 4.14 and I have not gone back further.
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9f62c15f
  11. 09 3月, 2018 2 次提交
  12. 08 3月, 2018 3 次提交
    • E
      ip6mr: remove synchronize_rcu() in favor of SOCK_RCU_FREE · a366e300
      Eric Dumazet 提交于
      Kirill found that recently added synchronize_rcu() call in
      ip6mr_sk_done()
      was slowing down netns dismantle and posted a patch to use it only if
      the socket
      was found.
      
      I instead suggested to get rid of this call, and use instead
      SOCK_RCU_FREE
      
      We might later change IPv4 side to use the same technique and unify
      both stacks. IPv4 does not use synchronize_rcu() but has a call_rcu()
      that could be replaced by SOCK_RCU_FREE.
      
      Tested:
       time for i in {1..1000}; do unshare -n /bin/false;done
      
       Before : real 7m18.911s
       After : real 10.187s
      
      Fixes: 8571ab47 ("ip6mr: Make mroute_sk rcu-based")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Yuval Mintz <yuvalm@mellanox.com>
      Reviewed-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a366e300
    • S
      ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routes · e9fa1495
      Stefano Brivio 提交于
      Currently, administrative MTU changes on a given netdevice are
      not reflected on route exceptions for MTU-less routes, with a
      set PMTU value, for that device:
      
       # ip -6 route get 2001:db8::b
       2001:db8::b from :: dev vti_a proto kernel src 2001:db8::a metric 256 pref medium
       # ping6 -c 1 -q -s10000 2001:db8::b > /dev/null
       # ip netns exec a ip -6 route get 2001:db8::b
       2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
           cache expires 571sec mtu 4926 pref medium
       # ip link set dev vti_a mtu 3000
       # ip -6 route get 2001:db8::b
       2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
           cache expires 571sec mtu 4926 pref medium
       # ip link set dev vti_a mtu 9000
       # ip -6 route get 2001:db8::b
       2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
           cache expires 571sec mtu 4926 pref medium
      
      The first issue is that since commit fb56be83 ("net-ipv6: on
      device mtu change do not add mtu to mtu-less routes") we don't
      call rt6_exceptions_update_pmtu() from rt6_mtu_change_route(),
      which handles administrative MTU changes, if the regular route
      is MTU-less.
      
      However, PMTU exceptions should be always updated, as long as
      RTAX_MTU is not locked. Keep the check for MTU-less main route,
      as introduced by that commit, but, for exceptions,
      call rt6_exceptions_update_pmtu() regardless of that check.
      
      Once that is fixed, one problem remains: MTU changes are not
      reflected if the new MTU is higher than the previous one,
      because rt6_exceptions_update_pmtu() doesn't allow that. We
      should instead allow PMTU increase if the old PMTU matches the
      local MTU, as that implies that the old MTU was the lowest in the
      path, and PMTU discovery might lead to different results.
      
      The existing check in rt6_mtu_change_route() correctly took that
      case into account (for regular routes only), so factor it out
      and re-use it also in rt6_exceptions_update_pmtu().
      
      While at it, fix comments style and grammar, and try to be a bit
      more descriptive.
      Reported-by: NXiumei Mu <xmu@redhat.com>
      Fixes: fb56be83 ("net-ipv6: on device mtu change do not add mtu to mtu-less routes")
      Fixes: f5bbe7ee ("ipv6: prepare rt6_mtu_change() for exception table")
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e9fa1495
    • G
      ipv6: ndisc: use true and false for boolean values · 9a21ac94
      Gustavo A. R. Silva 提交于
      Assign true or false to boolean variables instead of an integer value.
      
      This issue was detected with the help of Coccinelle.
      Signed-off-by: NGustavo A. R. Silva <garsilva@embeddedor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9a21ac94
  13. 07 3月, 2018 1 次提交
  14. 05 3月, 2018 9 次提交
  15. 02 3月, 2018 2 次提交
    • S
      ipv6: allow userspace to add IFA_F_OPTIMISTIC addresses · f1c02cfb
      Sabrina Dubroca 提交于
      According to RFC 4429 (section 3.1), adding new IPv6 addresses as
      optimistic addresses is acceptable, as long as the implementation
      follows some rules:
      
         * Optimistic DAD SHOULD only be used when the implementation is aware
              that the address is based on a most likely unique interface
              identifier (such as in [RFC2464]), generated randomly [RFC3041],
              or by a well-distributed hash function [RFC3972] or assigned by
              Dynamic Host Configuration Protocol for IPv6 (DHCPv6) [RFC3315].
              Optimistic DAD SHOULD NOT be used for manually entered
              addresses.
      
      Thus, it seems reasonable to allow userspace to set the optimistic flag
      when adding new addresses.
      
      We must not let userspace set NODAD + OPTIMISTIC, since if the kernel is
      not performing DAD we would never clear the optimistic flag. We must
      also ignore userspace's request to add OPTIMISTIC flag to addresses that
      have already completed DAD (addresses that don't have the TENTATIVE
      flag, or that have the DADFAILED flag).
      
      Then we also need to clear the OPTIMISTIC flag on permanent addresses
      when DAD fails. Otherwise, IFA_F_OPTIMISTIC addresses added by userspace
      can still be used after DAD has failed, because in
      ipv6_chk_addr_and_flags(), IFA_F_OPTIMISTIC overrides IFA_F_TENTATIVE.
      
      Setting IFA_F_OPTIMISTIC from userspace is conditional on
      CONFIG_IPV6_OPTIMISTIC_DAD and the optimistic_dad sysctl.
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f1c02cfb
    • Y
      ipmr, ip6mr: Unite dumproute flows · 7b0db857
      Yuval Mintz 提交于
      The various MFC entries are being held in the same kind of mr_tables
      for both ipmr and ip6mr, and their traversal logic is identical.
      Also, with the exception of the addresses [and other small tidbits]
      the major bulk of the nla setting is identical.
      
      Unite as much of the dumping as possible between the two.
      Notice this requires creating an mr_table iterator for each, as the
      for-each preprocessor macro can't be used by the common logic.
      Signed-off-by: NYuval Mintz <yuvalm@mellanox.com>
      Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7b0db857