1. 10 2月, 2017 6 次提交
  2. 02 2月, 2017 2 次提交
  3. 16 1月, 2017 1 次提交
    • L
      openvswitch: maintain correct checksum state in conntrack actions · 75f01a4c
      Lance Richardson 提交于
      When executing conntrack actions on skbuffs with checksum mode
      CHECKSUM_COMPLETE, the checksum must be updated to account for
      header pushes and pulls. Otherwise we get "hw csum failure"
      logs similar to this (ICMP packet received on geneve tunnel
      via ixgbe NIC):
      
      [  405.740065] genev_sys_6081: hw csum failure
      [  405.740106] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G          I     4.10.0-rc3+ #1
      [  405.740108] Call Trace:
      [  405.740110]  <IRQ>
      [  405.740113]  dump_stack+0x63/0x87
      [  405.740116]  netdev_rx_csum_fault+0x3a/0x40
      [  405.740118]  __skb_checksum_complete+0xcf/0xe0
      [  405.740120]  nf_ip_checksum+0xc8/0xf0
      [  405.740124]  icmp_error+0x1de/0x351 [nf_conntrack_ipv4]
      [  405.740132]  nf_conntrack_in+0xe1/0x550 [nf_conntrack]
      [  405.740137]  ? find_bucket.isra.2+0x62/0x70 [openvswitch]
      [  405.740143]  __ovs_ct_lookup+0x95/0x980 [openvswitch]
      [  405.740145]  ? netif_rx_internal+0x44/0x110
      [  405.740149]  ovs_ct_execute+0x147/0x4b0 [openvswitch]
      [  405.740153]  do_execute_actions+0x22e/0xa70 [openvswitch]
      [  405.740157]  ovs_execute_actions+0x40/0x120 [openvswitch]
      [  405.740161]  ovs_dp_process_packet+0x84/0x120 [openvswitch]
      [  405.740166]  ovs_vport_receive+0x73/0xd0 [openvswitch]
      [  405.740168]  ? udp_rcv+0x1a/0x20
      [  405.740170]  ? ip_local_deliver_finish+0x93/0x1e0
      [  405.740172]  ? ip_local_deliver+0x6f/0xe0
      [  405.740174]  ? ip_rcv_finish+0x3a0/0x3a0
      [  405.740176]  ? ip_rcv_finish+0xdb/0x3a0
      [  405.740177]  ? ip_rcv+0x2a7/0x400
      [  405.740180]  ? __netif_receive_skb_core+0x970/0xa00
      [  405.740185]  netdev_frame_hook+0xd3/0x160 [openvswitch]
      [  405.740187]  __netif_receive_skb_core+0x1dc/0xa00
      [  405.740194]  ? ixgbe_clean_rx_irq+0x46d/0xa20 [ixgbe]
      [  405.740197]  __netif_receive_skb+0x18/0x60
      [  405.740199]  netif_receive_skb_internal+0x40/0xb0
      [  405.740201]  napi_gro_receive+0xcd/0x120
      [  405.740204]  gro_cell_poll+0x57/0x80 [geneve]
      [  405.740206]  net_rx_action+0x260/0x3c0
      [  405.740209]  __do_softirq+0xc9/0x28c
      [  405.740211]  irq_exit+0xd9/0xf0
      [  405.740213]  do_IRQ+0x51/0xd0
      [  405.740215]  common_interrupt+0x93/0x93
      
      Fixes: 7f8a436e ("openvswitch: Add conntrack action")
      Signed-off-by: NLance Richardson <lrichard@redhat.com>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      75f01a4c
  4. 01 12月, 2016 1 次提交
  5. 03 11月, 2016 1 次提交
  6. 05 9月, 2016 1 次提交
  7. 04 8月, 2016 1 次提交
  8. 22 7月, 2016 1 次提交
    • F
      netfilter: conntrack: support a fixed size of 128 distinct labels · 23014011
      Florian Westphal 提交于
      The conntrack label extension is currently variable-sized, e.g. if
      only 2 labels are used by iptables rules then the labels->bits[] array
      will only contain one element.
      
      We track size of each label storage area in the 'words' member.
      
      But in nftables and openvswitch we always have to ask for worst-case
      since we don't know what bit will be used at configuration time.
      
      As most arches are 64bit we need to allocate 24 bytes in this case:
      
      struct nf_conn_labels {
          u8            words;   /*     0     1 */
          /* XXX 7 bytes hole, try to pack */
          long unsigned bits[2]; /*     8     24 */
      
      Make bits a fixed size and drop the words member, it simplifies
      the code and only increases memory requirements on x86 when
      less than 64bit labels are required.
      
      We still only allocate the extension if its needed.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      23014011
  9. 29 6月, 2016 1 次提交
    • S
      openvswitch: fix conntrack netlink event delivery · d913d3a7
      Samuel Gauthier 提交于
      Only the first and last netlink message for a particular conntrack are
      actually sent. The first message is sent through nf_conntrack_confirm when
      the conntrack is committed. The last one is sent when the conntrack is
      destroyed on timeout. The other conntrack state change messages are not
      advertised.
      
      When the conntrack subsystem is used from netfilter, nf_conntrack_confirm
      is called for each packet, from the postrouting hook, which in turn calls
      nf_ct_deliver_cached_events to send the state change netlink messages.
      
      This commit fixes the problem by calling nf_ct_deliver_cached_events in the
      non-commit case as well.
      
      Fixes: 7f8a436e ("openvswitch: Add conntrack action")
      CC: Joe Stringer <joestringer@nicira.com>
      CC: Justin Pettit <jpettit@nicira.com>
      CC: Andy Zhou <azhou@nicira.com>
      CC: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: NSamuel Gauthier <samuel.gauthier@6wind.com>
      Acked-by: NJoe Stringer <joe@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d913d3a7
  10. 25 6月, 2016 2 次提交
  11. 12 5月, 2016 1 次提交
    • J
      openvswitch: Fix cached ct with helper. · 16ec3d4f
      Joe Stringer 提交于
      When using conntrack helpers from OVS, a common configuration is to
      perform a lookup without specifying a helper, then go through a
      firewalling policy, only to decide to attach a helper afterwards.
      
      In this case, the initial lookup will cause a ct entry to be attached to
      the skb, then the later commit with helper should attach the helper and
      confirm the connection. However, the helper attachment has been missing.
      If the user has enabled automatic helper attachment, then this issue
      will be masked as it will be applied in init_conntrack(). It is also
      masked if the action is executed from ovs_packet_cmd_execute() as that
      will construct a fresh skb.
      
      This patch fixes the issue by making an explicit call to try to assign
      the helper if there is a discrepancy between the action's helper and the
      current skb->nfct.
      
      Fixes: cae3a262 ("openvswitch: Allow attaching helpers to ct action")
      Signed-off-by: NJoe Stringer <joe@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      16ec3d4f
  12. 05 5月, 2016 1 次提交
  13. 22 4月, 2016 1 次提交
    • J
      openvswitch: Orphan skbs before IPv6 defrag · 49e261a8
      Joe Stringer 提交于
      This is the IPv6 counterpart to commit 8282f274 ("inet: frag: Always
      orphan skbs inside ip_defrag()").
      
      Prior to commit 029f7f3b ("netfilter: ipv6: nf_defrag: avoid/free
      clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
      cloned (implicitly orphaning) prior to queueing for reassembly. As such,
      when the IPv6 message is eventually reassembled, the skb->sk for all
      fragments would be NULL. After that commit was introduced, rather than
      cloning, the original skbs were queued directly without orphaning. The
      end result is that all frags except for the first and last may have a
      socket attached.
      
      This commit explicitly orphans such skbs during nf_ct_frag6_gather() to
      prevent BUG_ON(skb->sk) during a later call to ip6_fragment().
      
      kernel BUG at net/ipv6/ip6_output.c:631!
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
       [<ffffffffa042c7c0>] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch]
       [<ffffffff810bb8a2>] ? __lock_is_held+0x52/0x70
       [<ffffffffa042c587>] ovs_fragment+0x1f7/0x280 [openvswitch]
       [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
       [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
       [<ffffffff81697ea0>] ? dst_discard_out+0x20/0x20
       [<ffffffff81697e80>] ? dst_ifdown+0x80/0x80
       [<ffffffffa042c703>] do_output.isra.28+0xf3/0x1b0 [openvswitch]
       [<ffffffffa042d279>] do_execute_actions+0x709/0x12c0 [openvswitch]
       [<ffffffffa04340a4>] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch]
       [<ffffffffa04340d1>] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch]
       [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
       [<ffffffffa042de75>] ovs_execute_actions+0x45/0x120 [openvswitch]
       [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
       [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
       [<ffffffffa042def4>] ovs_execute_actions+0xc4/0x120 [openvswitch]
       [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
       [<ffffffffa04337f2>] ? key_extract+0x442/0xc10 [openvswitch]
       [<ffffffffa043b26d>] ovs_vport_receive+0x5d/0xb0 [openvswitch]
       [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
       [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
       [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
       [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
       [<ffffffffa043c11d>] internal_dev_xmit+0x6d/0x150 [openvswitch]
       [<ffffffffa043c0b5>] ? internal_dev_xmit+0x5/0x150 [openvswitch]
       [<ffffffff8168fb5f>] dev_hard_start_xmit+0x2df/0x660
       [<ffffffff8168f5ea>] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0
       [<ffffffff81690925>] __dev_queue_xmit+0x8f5/0x950
       [<ffffffff81690080>] ? __dev_queue_xmit+0x50/0x950
       [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
       [<ffffffff81690990>] dev_queue_xmit+0x10/0x20
       [<ffffffff8169a418>] neigh_resolve_output+0x178/0x220
       [<ffffffff81752759>] ? ip6_finish_output2+0x219/0x7b0
       [<ffffffff81752759>] ip6_finish_output2+0x219/0x7b0
       [<ffffffff817525a5>] ? ip6_finish_output2+0x65/0x7b0
       [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
       [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
       [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
       [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
       [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
       [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
       [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
       [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
       [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
       [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
       [<ffffffff817796cc>] icmpv6_push_pending_frames+0xac/0xe0
       [<ffffffff8177a4be>] icmpv6_echo_reply+0x42e/0x500
       [<ffffffff8177acbf>] icmpv6_rcv+0x4cf/0x580
       [<ffffffff81755ac7>] ip6_input_finish+0x1a7/0x690
       [<ffffffff81755925>] ? ip6_input_finish+0x5/0x690
       [<ffffffff817567a0>] ip6_input+0x30/0xa0
       [<ffffffff81755920>] ? ip6_rcv_finish+0x1a0/0x1a0
       [<ffffffff817557ce>] ip6_rcv_finish+0x4e/0x1a0
       [<ffffffff8175640f>] ipv6_rcv+0x45f/0x7c0
       [<ffffffff81755fe6>] ? ipv6_rcv+0x36/0x7c0
       [<ffffffff81755780>] ? ip6_make_skb+0x1c0/0x1c0
       [<ffffffff8168b649>] __netif_receive_skb_core+0x229/0xb80
       [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
       [<ffffffff8168c07f>] ? process_backlog+0x6f/0x230
       [<ffffffff8168bfb6>] __netif_receive_skb+0x16/0x70
       [<ffffffff8168c088>] process_backlog+0x78/0x230
       [<ffffffff8168c0ed>] ? process_backlog+0xdd/0x230
       [<ffffffff8168db43>] net_rx_action+0x203/0x480
       [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
       [<ffffffff817c156e>] __do_softirq+0xde/0x49f
       [<ffffffff81752768>] ? ip6_finish_output2+0x228/0x7b0
       [<ffffffff817c070c>] do_softirq_own_stack+0x1c/0x30
       <EOI>
       [<ffffffff8106f88b>] do_softirq.part.18+0x3b/0x40
       [<ffffffff8106f946>] __local_bh_enable_ip+0xb6/0xc0
       [<ffffffff81752791>] ip6_finish_output2+0x251/0x7b0
       [<ffffffff81754af1>] ? ip6_fragment+0xba1/0xc50
       [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
       [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
       [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
       [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
       [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
       [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
       [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
       [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
       [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
       [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
       [<ffffffff81778558>] rawv6_sendmsg+0xa28/0xe30
       [<ffffffff81719097>] ? inet_sendmsg+0xc7/0x1d0
       [<ffffffff817190d6>] inet_sendmsg+0x106/0x1d0
       [<ffffffff81718fd5>] ? inet_sendmsg+0x5/0x1d0
       [<ffffffff8166d078>] sock_sendmsg+0x38/0x50
       [<ffffffff8166d4d6>] SYSC_sendto+0xf6/0x170
       [<ffffffff8100201b>] ? trace_hardirqs_on_thunk+0x1b/0x1d
       [<ffffffff8166e38e>] SyS_sendto+0xe/0x10
       [<ffffffff817bebe5>] entry_SYSCALL_64_fastpath+0x18/0xa8
      Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a <0f> 0b 41 8b 86 cc 00 00 00 49 8#
      RIP  [<ffffffff8175468a>] ip6_fragment+0x73a/0xc50
       RSP <ffff880072803120>
      
      Fixes: 029f7f3b ("netfilter: ipv6: nf_defrag: avoid/free clone
      operations")
      Reported-by: NDaniele Di Proietto <diproiettod@vmware.com>
      Signed-off-by: NJoe Stringer <joe@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      49e261a8
  14. 19 4月, 2016 1 次提交
  15. 28 3月, 2016 3 次提交
    • A
      openvswitch: call only into reachable nf-nat code · 99b7248e
      Arnd Bergmann 提交于
      The openvswitch code has gained support for calling into the
      nf-nat-ipv4/ipv6 modules, however those can be loadable modules
      in a configuration in which openvswitch is built-in, leading
      to link errors:
      
      net/built-in.o: In function `__ovs_ct_lookup':
      :(.text+0x2cc2c8): undefined reference to `nf_nat_icmp_reply_translation'
      :(.text+0x2cc66c): undefined reference to `nf_nat_icmpv6_reply_translation'
      
      The dependency on (!NF_NAT || NF_NAT) prevents similar issues,
      but NF_NAT is set to 'y' if any of the symbols selecting
      it are built-in, but the link error happens when any of them
      are modular.
      
      A second issue is that even if CONFIG_NF_NAT_IPV6 is built-in,
      CONFIG_NF_NAT_IPV4 might be completely disabled. This is unlikely
      to be useful in practice, but the driver currently only handles
      IPv6 being optional.
      
      This patch improves the Kconfig dependency so that openvswitch
      cannot be built-in if either of the two other symbols are set
      to 'm', and it replaces the incorrect #ifdef in ovs_ct_nat_execute()
      with two "if (IS_ENABLED())" checks that should catch all corner
      cases also make the code more readable.
      
      The same #ifdef exists ovs_ct_nat_to_attr(), where it does not
      cause a link error, but for consistency I'm changing it the same
      way.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Fixes: 05752523 ("openvswitch: Interface with NAT.")
      Acked-by: NJoe Stringer <joe@ovn.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      99b7248e
    • J
      openvswitch: Fix checking for new expected connections. · 5745b0be
      Jarno Rajahalme 提交于
      OVS should call into CT NAT for packets of new expected connections only
      when the conntrack state is persisted with the 'commit' option to the
      OVS CT action.  The test for this condition is doubly wrong, as the CT
      status field is ANDed with the bit number (IPS_EXPECTED_BIT) rather
      than the mask (IPS_EXPECTED), and due to the wrong assumption that the
      expected bit would apply only for the first (i.e., 'new') packet of a
      connection, while in fact the expected bit remains on for the lifetime of
      an expected connection.  The 'ctinfo' value IP_CT_RELATED derived from
      the ct status can be used instead, as it is only ever applicable to
      the 'new' packets of the expected connection.
      
      Fixes: 05752523 ('openvswitch: Interface with NAT.')
      Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NJarno Rajahalme <jarno@ovn.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      5745b0be
    • H
      openvswitch: Use proper buffer size in nla_memcpy · ac71b46e
      Haishuang Yan 提交于
      For the input parameter count, it's better to use the size
      of destination buffer size, as nla_memcpy would take into
      account the length of the source netlink attribute when
      a data is copied from an attribute.
      Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac71b46e
  16. 15 3月, 2016 7 次提交
  17. 30 12月, 2015 1 次提交
    • J
      openvswitch: Fix template leak in error cases. · 90c7afc9
      Joe Stringer 提交于
      Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
      reference leak on helper objects, but inadvertently introduced a leak on
      the ct template.
      
      Previously, ct_info.ct->general.use was initialized to 0 by
      nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
      returned successful. If an error occurred while adding the helper or
      adding the action to the actions buffer, the __ovs_ct_free_action()
      cleanup would use nf_ct_put() to free the entry; However, this relies on
      atomic_dec_and_test(ct_info.ct->general.use). This reference must be
      incremented first, or nf_ct_put() will never free it.
      
      Fix the issue by acquiring a reference to the template immediately after
      allocation.
      
      Fixes: cae3a262 ("openvswitch: Allow attaching helpers to ct action")
      Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak")
      Signed-off-by: NJoe Stringer <joe@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90c7afc9
  18. 12 12月, 2015 2 次提交
  19. 24 11月, 2015 2 次提交
    • F
      netfilter: ipv6: avoid nf_iterate recursion · daaa7d64
      Florian Westphal 提交于
      The previous patch changed nf_ct_frag6_gather() to morph reassembled skb
      with the previous one.
      
      This means that the return value is always NULL or the skb argument.
      So change it to an err value.
      
      Instead of invoking NF_HOOK recursively with threshold to skip already-called hooks
      we can now just return NF_ACCEPT to move on to the next hook except for
      -EINPROGRESS (which means skb has been queued for reassembly), in which case we
      return NF_STOLEN.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      daaa7d64
    • F
      netfilter: ipv6: nf_defrag: avoid/free clone operations · 029f7f3b
      Florian Westphal 提交于
      commit 6aafeef0
      ("netfilter: push reasm skb through instead of original frag skbs")
      changed ipv6 defrag to not use the original skbs anymore.
      
      So rather than keeping the original skbs around just to discard them
      afterwards just use the original skbs directly for the fraglist of
      the newly assembled skb and remove the extra clone/free operations.
      
      The skb that completes the fragment queue is morphed into a the
      reassembled one instead, just like ipv4 defrag.
      
      openvswitch doesn't need any additional skb_morph magic anymore to deal
      with this situation so just remove that.
      
      A followup patch can then also remove the NF_HOOK (re)invocation in
      the ipv6 netfilter defrag hook.
      
      Cc: Joe Stringer <joestringer@nicira.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      029f7f3b
  20. 28 10月, 2015 2 次提交
  21. 22 10月, 2015 2 次提交