1. 24 Apr, 2016 1 commit
  2. 22 Apr, 2016 2 commits
    • openvswitch: use flow protocol when recalculating ipv6 checksums · b4f70527
      Simon Horman committed
      When using masked actions the ipv6_proto field of an action
      to set IPv6 fields may be zero rather than the prevailing protocol
      which will result in skipping checksum recalculation.
      
      This patch resolves the problem by relying on the protocol
      in the flow key rather than that in the set field action.
      
      Fixes: 83d2b9ba ("net: openvswitch: Support masked set actions.")
      Cc: Jarno Rajahalme <jrajahalme@nicira.com>
      Signed-off-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
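The difference between the two protocol sources can be illustrated with a small standalone sketch. This is not the kernel code: the types and functions below (`demo_set_ipv6_action`, `demo_flow_key`, `demo_recalc_from_action`, `demo_recalc_from_flow_key`) are hypothetical stand-ins, and the only assumption carried over from the commit message is that a masked action may leave its protocol field at zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IANA protocol numbers for transports whose checksum covers IPv6 addresses. */
enum { DEMO_PROTO_TCP = 6, DEMO_PROTO_UDP = 17, DEMO_PROTO_ICMPV6 = 58 };

/* Hypothetical set-field action: with a masked action, ipv6_proto may be 0. */
struct demo_set_ipv6_action {
    uint8_t ipv6_proto;
};

/* Hypothetical flow key: holds the protocol actually seen in the packet. */
struct demo_flow_key {
    uint8_t ip_proto;
};

static bool demo_needs_l4_csum_update(uint8_t proto)
{
    return proto == DEMO_PROTO_TCP || proto == DEMO_PROTO_UDP ||
           proto == DEMO_PROTO_ICMPV6;
}

/* Problematic variant: trusts the protocol stored in the action. */
bool demo_recalc_from_action(const struct demo_set_ipv6_action *act)
{
    assert(act != NULL);
    return demo_needs_l4_csum_update(act->ipv6_proto);
}

/* Variant matching the described fix: trusts the flow key's protocol. */
bool demo_recalc_from_flow_key(const struct demo_flow_key *key)
{
    assert(key != NULL);
    return demo_needs_l4_csum_update(key->ip_proto);
}
```

With an action whose `ipv6_proto` is 0 and a flow key whose `ip_proto` is TCP, the first variant skips the recalculation while the second performs it.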
    • openvswitch: Orphan skbs before IPv6 defrag · 49e261a8
      Joe Stringer committed
      This is the IPv6 counterpart to commit 8282f274 ("inet: frag: Always
      orphan skbs inside ip_defrag()").
      
      Prior to commit 029f7f3b ("netfilter: ipv6: nf_defrag: avoid/free
      clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
      cloned (implicitly orphaning) prior to queueing for reassembly. As such,
      when the IPv6 message is eventually reassembled, the skb->sk for all
      fragments would be NULL. After that commit was introduced, rather than
      cloning, the original skbs were queued directly without orphaning. The
      end result is that all frags except for the first and last may have a
      socket attached.
      
      This commit explicitly orphans such skbs during nf_ct_frag6_gather() to
      prevent BUG_ON(skb->sk) during a later call to ip6_fragment().
      
      kernel BUG at net/ipv6/ip6_output.c:631!
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
       [<ffffffffa042c7c0>] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch]
       [<ffffffff810bb8a2>] ? __lock_is_held+0x52/0x70
       [<ffffffffa042c587>] ovs_fragment+0x1f7/0x280 [openvswitch]
       [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
       [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
       [<ffffffff81697ea0>] ? dst_discard_out+0x20/0x20
       [<ffffffff81697e80>] ? dst_ifdown+0x80/0x80
       [<ffffffffa042c703>] do_output.isra.28+0xf3/0x1b0 [openvswitch]
       [<ffffffffa042d279>] do_execute_actions+0x709/0x12c0 [openvswitch]
       [<ffffffffa04340a4>] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch]
       [<ffffffffa04340d1>] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch]
       [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
       [<ffffffffa042de75>] ovs_execute_actions+0x45/0x120 [openvswitch]
       [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
       [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
       [<ffffffffa042def4>] ovs_execute_actions+0xc4/0x120 [openvswitch]
       [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
       [<ffffffffa04337f2>] ? key_extract+0x442/0xc10 [openvswitch]
       [<ffffffffa043b26d>] ovs_vport_receive+0x5d/0xb0 [openvswitch]
       [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
       [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
       [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
       [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
       [<ffffffffa043c11d>] internal_dev_xmit+0x6d/0x150 [openvswitch]
       [<ffffffffa043c0b5>] ? internal_dev_xmit+0x5/0x150 [openvswitch]
       [<ffffffff8168fb5f>] dev_hard_start_xmit+0x2df/0x660
       [<ffffffff8168f5ea>] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0
       [<ffffffff81690925>] __dev_queue_xmit+0x8f5/0x950
       [<ffffffff81690080>] ? __dev_queue_xmit+0x50/0x950
       [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
       [<ffffffff81690990>] dev_queue_xmit+0x10/0x20
       [<ffffffff8169a418>] neigh_resolve_output+0x178/0x220
       [<ffffffff81752759>] ? ip6_finish_output2+0x219/0x7b0
       [<ffffffff81752759>] ip6_finish_output2+0x219/0x7b0
       [<ffffffff817525a5>] ? ip6_finish_output2+0x65/0x7b0
       [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
       [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
       [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
       [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
       [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
       [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
       [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
       [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
       [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
       [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
       [<ffffffff817796cc>] icmpv6_push_pending_frames+0xac/0xe0
       [<ffffffff8177a4be>] icmpv6_echo_reply+0x42e/0x500
       [<ffffffff8177acbf>] icmpv6_rcv+0x4cf/0x580
       [<ffffffff81755ac7>] ip6_input_finish+0x1a7/0x690
       [<ffffffff81755925>] ? ip6_input_finish+0x5/0x690
       [<ffffffff817567a0>] ip6_input+0x30/0xa0
       [<ffffffff81755920>] ? ip6_rcv_finish+0x1a0/0x1a0
       [<ffffffff817557ce>] ip6_rcv_finish+0x4e/0x1a0
       [<ffffffff8175640f>] ipv6_rcv+0x45f/0x7c0
       [<ffffffff81755fe6>] ? ipv6_rcv+0x36/0x7c0
       [<ffffffff81755780>] ? ip6_make_skb+0x1c0/0x1c0
       [<ffffffff8168b649>] __netif_receive_skb_core+0x229/0xb80
       [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
       [<ffffffff8168c07f>] ? process_backlog+0x6f/0x230
       [<ffffffff8168bfb6>] __netif_receive_skb+0x16/0x70
       [<ffffffff8168c088>] process_backlog+0x78/0x230
       [<ffffffff8168c0ed>] ? process_backlog+0xdd/0x230
       [<ffffffff8168db43>] net_rx_action+0x203/0x480
       [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
       [<ffffffff817c156e>] __do_softirq+0xde/0x49f
       [<ffffffff81752768>] ? ip6_finish_output2+0x228/0x7b0
       [<ffffffff817c070c>] do_softirq_own_stack+0x1c/0x30
       <EOI>
       [<ffffffff8106f88b>] do_softirq.part.18+0x3b/0x40
       [<ffffffff8106f946>] __local_bh_enable_ip+0xb6/0xc0
       [<ffffffff81752791>] ip6_finish_output2+0x251/0x7b0
       [<ffffffff81754af1>] ? ip6_fragment+0xba1/0xc50
       [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
       [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
       [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
       [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
       [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
       [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
       [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
       [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
       [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
       [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
       [<ffffffff81778558>] rawv6_sendmsg+0xa28/0xe30
       [<ffffffff81719097>] ? inet_sendmsg+0xc7/0x1d0
       [<ffffffff817190d6>] inet_sendmsg+0x106/0x1d0
       [<ffffffff81718fd5>] ? inet_sendmsg+0x5/0x1d0
       [<ffffffff8166d078>] sock_sendmsg+0x38/0x50
       [<ffffffff8166d4d6>] SYSC_sendto+0xf6/0x170
       [<ffffffff8100201b>] ? trace_hardirqs_on_thunk+0x1b/0x1d
       [<ffffffff8166e38e>] SyS_sendto+0xe/0x10
       [<ffffffff817bebe5>] entry_SYSCALL_64_fastpath+0x18/0xa8
      Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a <0f> 0b 41 8b 86 cc 00 00 00 49 8#
      RIP  [<ffffffff8175468a>] ip6_fragment+0x73a/0xc50
       RSP <ffff880072803120>
      
      Fixes: 029f7f3b ("netfilter: ipv6: nf_defrag: avoid/free clone
      operations")
      Reported-by: Daniele Di Proietto <diproiettod@vmware.com>
      Signed-off-by: Joe Stringer <joe@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 17 Apr, 2016 1 commit
  4. 28 Mar, 2016 3 commits
    • openvswitch: call only into reachable nf-nat code · 99b7248e
      Arnd Bergmann committed
      The openvswitch code has gained support for calling into the
      nf-nat-ipv4/ipv6 modules, however those can be loadable modules
      in a configuration in which openvswitch is built-in, leading
      to link errors:
      
      net/built-in.o: In function `__ovs_ct_lookup':
      :(.text+0x2cc2c8): undefined reference to `nf_nat_icmp_reply_translation'
      :(.text+0x2cc66c): undefined reference to `nf_nat_icmpv6_reply_translation'
      
      The dependency on (!NF_NAT || NF_NAT) prevents similar issues,
      but NF_NAT is set to 'y' if any of the symbols selecting
      it is built-in, whereas the link error happens when any of them
      is modular.
      
      A second issue is that even if CONFIG_NF_NAT_IPV6 is built-in,
      CONFIG_NF_NAT_IPV4 might be completely disabled. This is unlikely
      to be useful in practice, but the driver currently only handles
      IPv6 being optional.
      
      This patch improves the Kconfig dependency so that openvswitch
      cannot be built-in if either of the two other symbols are set
      to 'm', and it replaces the incorrect #ifdef in ovs_ct_nat_execute()
      with two "if (IS_ENABLED())" checks that should catch all corner
      cases and also make the code more readable.
      
      The same #ifdef exists in ovs_ct_nat_to_attr(), where it does not
      cause a link error, but for consistency I'm changing it the same
      way.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 05752523 ("openvswitch: Interface with NAT.")
      Acked-by: Joe Stringer <joe@ovn.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • openvswitch: Fix checking for new expected connections. · 5745b0be
      Jarno Rajahalme committed
      OVS should call into CT NAT for packets of new expected connections only
      when the conntrack state is persisted with the 'commit' option to the
      OVS CT action.  The test for this condition is doubly wrong, as the CT
      status field is ANDed with the bit number (IPS_EXPECTED_BIT) rather
      than the mask (IPS_EXPECTED), and due to the wrong assumption that the
      expected bit would apply only for the first (i.e., 'new') packet of a
      connection, while in fact the expected bit remains on for the lifetime of
      an expected connection.  The 'ctinfo' value IP_CT_RELATED derived from
      the ct status can be used instead, as it is only ever applicable to
      the 'new' packets of the expected connection.
      
      Fixes: 05752523 ('openvswitch: Interface with NAT.')
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
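The first half of the bug described above, ANDing with a bit number instead of a mask, can be reproduced in isolation. The sketch below uses local `DEMO_` constants rather than the kernel headers; their values are assumed to follow the usual "bit number plus derived mask" layout, with the expected bit at position 0. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for a status enum: a bit number and the mask derived from it. */
enum demo_status {
    DEMO_EXPECTED_BIT   = 0,
    DEMO_EXPECTED       = 1 << DEMO_EXPECTED_BIT,
    DEMO_SEEN_REPLY_BIT = 1,
    DEMO_SEEN_REPLY     = 1 << DEMO_SEEN_REPLY_BIT,
};

/* The bit number and the mask are different values; mixing them up is the bug. */
static_assert(DEMO_EXPECTED_BIT != DEMO_EXPECTED, "bit number is not the mask");

/* Wrong: ANDs with the bit number. With bit number 0 this is always false. */
bool demo_is_expected_wrong(unsigned long status)
{
    return (status & DEMO_EXPECTED_BIT) != 0;
}

/* Right: ANDs with the mask. */
bool demo_is_expected(unsigned long status)
{
    return (status & DEMO_EXPECTED) != 0;
}
```

This only covers the mask mix-up. The second problem named in the message, that the expected bit stays set for the lifetime of the connection, is why the commit switches to the `ctinfo` value instead of testing the status bit at all.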
    • openvswitch: Use proper buffer size in nla_memcpy · ac71b46e
      Haishuang Yan committed
      For the input parameter count, it's better to use the size of the
      destination buffer, as nla_memcpy takes into account the length
      of the source netlink attribute when data is copied from an
      attribute.
      Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
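The reasoning behind passing the destination size can be sketched with a simplified copy helper. `demo_attr` and `demo_attr_memcpy` are hypothetical and only model the behaviour the message relies on: the copy length is bounded by both the caller's count and the attribute's own payload length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified attribute: a payload pointer and its length. */
struct demo_attr {
    size_t len;
    const unsigned char *data;
};

/* Copies at most count bytes, and never more than the attribute holds.
 * Returns the number of bytes actually copied. */
size_t demo_attr_memcpy(void *dest, const struct demo_attr *src, size_t count)
{
    assert(dest != NULL && src != NULL);

    size_t n = src->len < count ? src->len : count;
    if (n > 0)
        memcpy(dest, src->data, n);
    return n;
}
```

Because the helper already clamps to the source length, passing `sizeof(dest_buffer)` as `count` is enough to protect the destination; passing the source length adds no protection on its own.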
  5. 19 Mar, 2016 3 commits
  6. 15 Mar, 2016 7 commits
  7. 14 Mar, 2016 1 commit
  8. 02 Mar, 2016 1 commit
  9. 20 Feb, 2016 2 commits
  10. 19 Feb, 2016 2 commits
  11. 17 Feb, 2016 1 commit
  12. 15 Feb, 2016 1 commit
  13. 11 Feb, 2016 1 commit
    • openvswitch: allow management from inside user namespaces · 4a92602a
      Tycho Andersen committed
      Operations with the GENL_ADMIN_PERM flag fail permissions checks because
      this flag means we call netlink_capable, which uses the init user ns.
      
      Instead, let's introduce a new flag, GENL_UNS_ADMIN_PERM for operations
      which should be allowed inside a user namespace.
      
      The motivation for this is to be able to run openvswitch in unprivileged
      containers. I've tested this and it seems to work, but I really have no
      idea about the security consequences of this patch, so thoughts would be
      much appreciated.
      
      v2: use the GENL_UNS_ADMIN_PERM flag instead of a check in each function
      v3: use separate ifs for UNS_ADMIN_PERM and ADMIN_PERM, instead of one
          massive one
      Reported-by: James Page <james.page@canonical.com>
      Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
      CC: Eric Biederman <ebiederm@xmission.com>
      CC: Pravin Shelar <pshelar@ovn.org>
      CC: Justin Pettit <jpettit@nicira.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
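The "separate ifs" structure mentioned in the v3 note can be sketched as follows. Everything here is hypothetical: the `DEMO_` flag values, the `demo_caller` struct, and `demo_check_perm` only model the idea that one flag demands privilege in the initial user namespace while the other accepts privilege in the caller's own namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation flags, one check per flag. */
enum {
    DEMO_ADMIN_PERM     = 0x01, /* needs privilege in the initial user ns */
    DEMO_UNS_ADMIN_PERM = 0x10, /* privilege in the caller's own ns is enough */
};

/* Hypothetical caller description. */
struct demo_caller {
    bool admin_in_init_ns;
    bool admin_in_own_ns;
};

/* Returns 0 if the operation is allowed, -1 otherwise. */
int demo_check_perm(unsigned int op_flags, const struct demo_caller *c)
{
    assert(c != NULL);

    if ((op_flags & DEMO_ADMIN_PERM) && !c->admin_in_init_ns)
        return -1;

    if ((op_flags & DEMO_UNS_ADMIN_PERM) && !c->admin_in_own_ns)
        return -1;

    return 0;
}
```

A caller that is privileged only inside its own namespace, as in an unprivileged container, fails the first check but passes the second.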
  14. 10 Feb, 2016 1 commit
    • vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices · 7e059158
      David Wragg committed
      Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
      transmit vxlan packets of any size, constrained only by the ability to
      send out the resulting packets.  4.3 introduced netdevs corresponding
      to tunnel vports.  These netdevs have an MTU, which limits the size of
      a packet that can be successfully encapsulated.  The default MTU
      values are low (1500 or less), which is awkwardly small in the context
      of physical networks supporting jumbo frames, and leads to a
      conspicuous change in behaviour for userspace.
      
      Instead, set the MTU on openvswitch-created netdevs to be the relevant
      maximum (i.e. the maximum IP packet size minus any relevant overhead),
      effectively restoring the behaviour prior to 4.3.
      Signed-off-by: David Wragg <david@weave.works>
      Signed-off-by: David S. Miller <davem@davemloft.net>
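The "maximum IP packet size minus any relevant overhead" rule is simple arithmetic, sketched below. The helper name and the overhead constant are assumptions for illustration: the example counts a 20-byte outer IPv4 header, 8-byte UDP header, 8-byte VXLAN header and 14-byte inner Ethernet header, which may not match what each tunnel driver actually subtracts.

```c
#include <assert.h>

/* Largest value of the 16-bit IPv4 total-length field. */
#define DEMO_IP_MAX_PACKET 65535

/* Illustrative overhead for VXLAN over IPv4 (an assumption, see lead-in):
 * outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14). */
enum { DEMO_VXLAN_IPV4_OVERHEAD = 20 + 8 + 8 + 14 };

/* Largest MTU that still fits in one IP packet after encapsulation. */
int demo_max_tunnel_mtu(int overhead)
{
    assert(overhead >= 0 && overhead < DEMO_IP_MAX_PACKET);
    return DEMO_IP_MAX_PACKET - overhead;
}
```

Under these assumed header sizes, `demo_max_tunnel_mtu(DEMO_VXLAN_IPV4_OVERHEAD)` yields 65485, far above the 1500-byte defaults the message describes as awkwardly small.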
  15. 19 Jan, 2016 1 commit
    • ovs: limit ovs recursions in ovs_execute_actions to not corrupt stack · b064d0d8
      Hannes Frederic Sowa committed
      It was seen that defective configurations of openvswitch could overwrite
      the STACK_END_MAGIC and cause a hard crash of the kernel because of too
      many recursions within ovs.
      
      This problem arises due to the high stack usage of openvswitch. The rest
      of the kernel is fine with the current limit of 10 (RECURSION_LIMIT).
      
      We use the already existing recursion counter in ovs_execute_actions to
      implement an upper bound of 5 recursions.
      
      Cc: Pravin Shelar <pshelar@ovn.org>
      Cc: Simon Horman <simon.horman@netronome.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
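A depth counter that bounds recursion, as described above, can be sketched like this. The names are hypothetical, and a plain global counter stands in for whatever per-context counter the real code uses, so the sketch is single-threaded only. Whether the bound of 5 counts nesting levels exactly this way in the kernel is an assumption.

```c
#include <assert.h>

#define DEMO_RECURSION_LIMIT 5

/* Current nesting depth; a plain global keeps the sketch simple. */
static int demo_depth;

/* Executes an "action" that recurses nested more times.
 * Returns 0 on success, -1 if the recursion limit would be exceeded. */
int demo_execute(int nested)
{
    int err = 0;

    assert(nested >= 0);

    /* Refuse to go deeper instead of risking a stack overflow. */
    if (demo_depth >= DEMO_RECURSION_LIMIT)
        return -1;

    demo_depth++;
    if (nested > 0)
        err = demo_execute(nested - 1);
    demo_depth--; /* always unwind, on success or on error */

    return err;
}
```

The counter is decremented on every exit path, so a rejected call chain leaves the depth back at zero and later calls are unaffected.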
  16. 16 Jan, 2016 1 commit
  17. 11 Jan, 2016 3 commits
  18. 30 Dec, 2015 1 commit
    • openvswitch: Fix template leak in error cases. · 90c7afc9
      Joe Stringer committed
      Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
      reference leak on helper objects, but inadvertently introduced a leak on
      the ct template.
      
      Previously, ct_info.ct->general.use was initialized to 0 by
      nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
      returned successfully. If an error occurred while adding the helper or
      adding the action to the actions buffer, the __ovs_ct_free_action()
      cleanup would use nf_ct_put() to free the entry; however, this relies on
      atomic_dec_and_test(ct_info.ct->general.use). This reference must be
      incremented first, or nf_ct_put() will never free it.
      
      Fix the issue by acquiring a reference to the template immediately after
      allocation.
      
      Fixes: cae3a262 ("openvswitch: Allow attaching helpers to ct action")
      Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak")
      Signed-off-by: Joe Stringer <joe@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
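Why a put on a zero-initialised counter never frees the object can be shown with a toy reference count. This is a non-atomic sketch with hypothetical names (`demo_tmpl`, `demo_get`, `demo_put`); it only models the "decrement, then free if the result is exactly zero" behaviour the message attributes to the put path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical template object with a plain (non-atomic) use counter. */
struct demo_tmpl {
    int use;
    bool freed;
};

/* Take a reference. */
void demo_get(struct demo_tmpl *t)
{
    assert(t != NULL);
    t->use++;
}

/* Drop a reference; free only when the counter reaches exactly zero. */
void demo_put(struct demo_tmpl *t)
{
    assert(t != NULL);
    if (--t->use == 0)
        t->freed = true;
}
```

Calling `demo_put()` on an object whose counter is still 0 drives it to -1, so the free branch is skipped and the object leaks. Taking a reference right after allocation, as the fix does, makes the same put reach zero.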
  19. 19 Dec, 2015 1 commit
  20. 12 Dec, 2015 2 commits
  21. 04 Dec, 2015 3 commits
  22. 03 Dec, 2015 1 commit