1. 06 4月, 2022 2 次提交
    • I
      net: openvswitch: fix leak of nested actions · 1f30fb91
      Ilya Maximets 提交于
      While parsing user-provided actions, openvswitch module may dynamically
      allocate memory and store pointers in the internal copy of the actions.
      So this memory has to be freed while destroying the actions.
      
      Currently there are only two such actions: ct() and set().  However,
      there are many actions that can hold nested lists of actions and
      ovs_nla_free_flow_actions() just jumps over them leaking the memory.
      
      For example, removal of the flow with the following actions will lead
      to a leak of the memory allocated by nf_ct_tmpl_alloc():
      
        actions:clone(ct(commit),0)
      
      Non-freed set() action may also leak the 'dst' structure for the
      tunnel info including device references.
      
      Under certain conditions with a high rate of flow rotation that may
      cause significant memory leak problem (2MB per second in reporter's
      case).  The problem is also hard to mitigate, because the user doesn't
      have direct control over the datapath flows generated by OVS.
      
      Fix that by iterating over all the nested actions and freeing
      everything that needs to be freed recursively.
      
      New build time assertion should protect us from this problem if new
      actions will be added in the future.
      
      Unfortunately, openvswitch module doesn't use NLA_F_NESTED, so all
      attributes has to be explicitly checked.  sample() and clone() actions
      are mixing extra attributes into the user-provided action list.  That
      prevents some code generalization too.
      
      Fixes: 34ae932a ("openvswitch: Make tunnel set action attach a metadata dst")
      Link: https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392922.htmlReported-by: NStéphane Graber <stgraber@ubuntu.com>
      Signed-off-by: NIlya Maximets <i.maximets@ovn.org>
      Acked-by: NAaron Conole <aconole@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1f30fb91
    • I
      net: openvswitch: don't send internal clone attribute to the userspace. · 3f2a3050
      Ilya Maximets 提交于
      'OVS_CLONE_ATTR_EXEC' is an internal attribute that is used for
      performance optimization inside the kernel.  It's added by the kernel
      while parsing user-provided actions and should not be sent during the
      flow dump as it's not part of the uAPI.
      
      The issue doesn't cause any significant problems to the ovs-vswitchd
      process, because reported actions are not really used in the
      application lifecycle and only supposed to be shown to a human via
      ovs-dpctl flow dump.  However, the action list is still incorrect
      and causes the following error if the user wants to look at the
      datapath flows:
      
        # ovs-dpctl add-dp system@ovs-system
        # ovs-dpctl add-flow "<flow match>" "clone(ct(commit),0)"
        # ovs-dpctl dump-flows
        <flow match>, packets:0, bytes:0, used:never,
          actions:clone(bad length 4, expected -1 for: action0(01 00 00 00),
                        ct(commit),0)
      
      With the fix:
      
        # ovs-dpctl dump-flows
        <flow match>, packets:0, bytes:0, used:never,
          actions:clone(ct(commit),0)
      
      Additionally fixed an incorrect attribute name in the comment.
      
      Fixes: b2335040 ("openvswitch: kernel datapath clone action")
      Signed-off-by: NIlya Maximets <i.maximets@ovn.org>
      Acked-by: NAaron Conole <aconole@redhat.com>
      Link: https://lore.kernel.org/r/20220404104150.2865736-1-i.maximets@ovn.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>
      3f2a3050
  2. 29 3月, 2022 1 次提交
  3. 11 3月, 2022 1 次提交
  4. 25 2月, 2022 1 次提交
  5. 15 1月, 2021 1 次提交
  6. 05 12月, 2020 1 次提交
  7. 28 11月, 2020 1 次提交
  8. 14 7月, 2020 1 次提交
  9. 21 2月, 2020 1 次提交
    • K
      openvswitch: Distribute switch variables for initialization · 16a556ee
      Kees Cook 提交于
      Variables declared in a switch statement before any case statements
      cannot be automatically initialized with compiler instrumentation (as
      they are not part of any execution flow). With GCC's proposed automatic
      stack variable initialization feature, this triggers a warning (and they
      don't get initialized). Clang's automatic stack variable initialization
      (via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
      doesn't initialize such variables[1]. Note that these warnings (or silent
      skipping) happen before the dead-store elimination optimization phase,
      so even when the automatic initializations are later elided in favor of
      direct initializations, the warnings remain.
      
      To avoid these problems, move such variables into the "case" where
      they're used or lift them up into the main function body.
      
      net/openvswitch/flow_netlink.c: In function ‘validate_set’:
      net/openvswitch/flow_netlink.c:2711:29: warning: statement will never be executed [-Wswitch-unreachable]
       2711 |  const struct ovs_key_ipv4 *ipv4_key;
            |                             ^~~~~~~~
      
      [1] https://bugs.llvm.org/show_bug.cgi?id=44916Signed-off-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      16a556ee
  10. 17 2月, 2020 1 次提交
    • M
      openvswitch: add TTL decrement action · 744676e7
      Matteo Croce 提交于
      New action to decrement TTL instead of setting it to a fixed value.
      This action will decrement the TTL and, in case of expired TTL, drop it
      or execute an action passed via a nested attribute.
      The default TTL expired action is to drop the packet.
      
      Supports both IPv4 and IPv6 via the ttl and hop_limit fields, respectively.
      
      Tested with a corresponding change in the userspace:
      
          # ovs-dpctl dump-flows
          in_port(2),eth(),eth_type(0x0800), packets:0, bytes:0, used:never, actions:dec_ttl{ttl<=1 action:(drop)},1
          in_port(1),eth(),eth_type(0x0800), packets:0, bytes:0, used:never, actions:dec_ttl{ttl<=1 action:(drop)},2
          in_port(1),eth(),eth_type(0x0806), packets:0, bytes:0, used:never, actions:2
          in_port(2),eth(),eth_type(0x0806), packets:0, bytes:0, used:never, actions:1
      
          # ping -c1 192.168.0.2 -t 42
          IP (tos 0x0, ttl 41, id 61647, offset 0, flags [DF], proto ICMP (1), length 84)
              192.168.0.1 > 192.168.0.2: ICMP echo request, id 386, seq 1, length 64
          # ping -c1 192.168.0.2 -t 120
          IP (tos 0x0, ttl 119, id 62070, offset 0, flags [DF], proto ICMP (1), length 84)
              192.168.0.1 > 192.168.0.2: ICMP echo request, id 388, seq 1, length 64
          # ping -c1 192.168.0.2 -t 1
          #
      Co-developed-by: NBindiya Kurle <bindiyakurle@gmail.com>
      Signed-off-by: NBindiya Kurle <bindiyakurle@gmail.com>
      Signed-off-by: NMatteo Croce <mcroce@redhat.com>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      744676e7
  11. 25 12月, 2019 1 次提交
  12. 06 11月, 2019 1 次提交
  13. 05 6月, 2019 1 次提交
  14. 28 4月, 2019 2 次提交
    • J
      netlink: make validation more configurable for future strictness · 8cb08174
      Johannes Berg 提交于
      We currently have two levels of strict validation:
      
       1) liberal (default)
           - undefined (type >= max) & NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
           - garbage at end of message accepted
       2) strict (opt-in)
           - NLA_UNSPEC attributes accepted
           - attribute length >= expected accepted
      
      Split out parsing strictness into four different options:
       * TRAILING     - check that there's no trailing data after parsing
                        attributes (in message or nested)
       * MAXTYPE      - reject attrs > max known type
       * UNSPEC       - reject attributes with NLA_UNSPEC policy entries
       * STRICT_ATTRS - strictly validate attribute size
      
      The default for future things should be *everything*.
      The current *_strict() is a combination of TRAILING and MAXTYPE,
      and is renamed to _deprecated_strict().
      The current regular parsing has none of this, and is renamed to
      *_parse_deprecated().
      
      Additionally it allows us to selectively set one of the new flags
      even on old policies. Notably, the UNSPEC flag could be useful in
      this case, since it can be arranged (by filling in the policy) to
      not be an incompatible userspace ABI change, but would then going
      forward prevent forgetting attribute entries. Similar can apply
      to the POLICY flag.
      
      We end up with the following renames:
       * nla_parse           -> nla_parse_deprecated
       * nla_parse_strict    -> nla_parse_deprecated_strict
       * nlmsg_parse         -> nlmsg_parse_deprecated
       * nlmsg_parse_strict  -> nlmsg_parse_deprecated_strict
       * nla_parse_nested    -> nla_parse_nested_deprecated
       * nla_validate_nested -> nla_validate_nested_deprecated
      
      Using spatch, of course:
          @@
          expression TB, MAX, HEAD, LEN, POL, EXT;
          @@
          -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
          +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, TB, MAX, POL, EXT;
          @@
          -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
          +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
      
          @@
          expression TB, MAX, NLA, POL, EXT;
          @@
          -nla_parse_nested(TB, MAX, NLA, POL, EXT)
          +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)
      
          @@
          expression START, MAX, POL, EXT;
          @@
          -nla_validate_nested(START, MAX, POL, EXT)
          +nla_validate_nested_deprecated(START, MAX, POL, EXT)
      
          @@
          expression NLH, HDRLEN, MAX, POL, EXT;
          @@
          -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
          +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)
      
      For this patch, don't actually add the strict, non-renamed versions
      yet so that it breaks compile if I get it wrong.
      
      Also, while at it, make nla_validate and nla_parse go down to a
      common __nla_validate_parse() function to avoid code duplication.
      
      Ultimately, this allows us to have very strict validation for every
      new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
      next patch, while existing things will continue to work as is.
      
      In effect then, this adds fully strict validation for any new command.
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8cb08174
    • M
      netlink: make nla_nest_start() add NLA_F_NESTED flag · ae0be8de
      Michal Kubecek 提交于
      Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
      netlink based interfaces (including recently added ones) are still not
      setting it in kernel generated messages. Without the flag, message parsers
      not aware of attribute semantics (e.g. wireshark dissector or libmnl's
      mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
      the structure of their contents.
      
      Unfortunately we cannot just add the flag everywhere as there may be
      userspace applications which check nlattr::nla_type directly rather than
      through a helper masking out the flags. Therefore the patch renames
      nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
      as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
      are rewritten to use nla_nest_start().
      
      Except for changes in include/net/netlink.h, the patch was generated using
      this semantic patch:
      
      @@ expression E1, E2; @@
      -nla_nest_start(E1, E2)
      +nla_nest_start_noflag(E1, E2)
      
      @@ expression E1, E2; @@
      -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
      +nla_nest_start(E1, E2)
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ae0be8de
  15. 30 3月, 2019 1 次提交
    • W
      openvswitch: Make metadata_dst tunnel work in IP_TUNNEL_INFO_BRIDGE mode · 18b6f717
      wenxu 提交于
      There is currently no support for the multicast/broadcast aspects
      of VXLAN in ovs. In the datapath flow the tun_dst must specific.
      But in the IP_TUNNEL_INFO_BRIDGE mode the tun_dst can not be specific.
      And the packet can forward through the fdb table of vxlan devcice. In
      this mode the broadcast/multicast packet can be sent through the
      following ways in ovs.
      
      ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
              options:key=1000 options:remote_ip=flow
      ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff, \
              action=output:vxlan
      
      bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
              src_vni 1000 vni 1000 self
      bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
              src_vni 1000 vni 1000 self
      Signed-off-by: Nwenxu <wenxu@ucloud.cn>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      18b6f717
  16. 29 3月, 2019 1 次提交
    • A
      openvswitch: fix flow actions reallocation · f28cd2af
      Andrea Righi 提交于
      The flow action buffer can be resized if it's not big enough to contain
      all the requested flow actions. However, this resize doesn't take into
      account the new requested size, the buffer is only increased by a factor
      of 2x. This might be not enough to contain the new data, causing a
      buffer overflow, for example:
      
      [   42.044472] =============================================================================
      [   42.045608] BUG kmalloc-96 (Not tainted): Redzone overwritten
      [   42.046415] -----------------------------------------------------------------------------
      
      [   42.047715] Disabling lock debugging due to kernel taint
      [   42.047716] INFO: 0x8bf2c4a5-0x720c0928. First byte 0x0 instead of 0xcc
      [   42.048677] INFO: Slab 0xbc6d2040 objects=29 used=18 fp=0xdc07dec4 flags=0x2808101
      [   42.049743] INFO: Object 0xd53a3464 @offset=2528 fp=0xccdcdebb
      
      [   42.050747] Redzone 76f1b237: cc cc cc cc cc cc cc cc                          ........
      [   42.051839] Object d53a3464: 6b 6b 6b 6b 6b 6b 6b 6b 0c 00 00 00 6c 00 00 00  kkkkkkkk....l...
      [   42.053015] Object f49a30cc: 6c 00 0c 00 00 00 00 00 00 00 00 03 78 a3 15 f6  l...........x...
      [   42.054203] Object acfe4220: 20 00 02 00 ff ff ff ff 00 00 00 00 00 00 00 00   ...............
      [   42.055370] Object 21024e91: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [   42.056541] Object 070e04c3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [   42.057797] Object 948a777a: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [   42.059061] Redzone 8bf2c4a5: 00 00 00 00                                      ....
      [   42.060189] Padding a681b46e: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
      
      Fix by making sure the new buffer is properly resized to contain all the
      requested data.
      
      BugLink: https://bugs.launchpad.net/bugs/1813244Signed-off-by: NAndrea Righi <andrea.righi@canonical.com>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f28cd2af
  17. 28 3月, 2019 1 次提交
    • N
      net: openvswitch: Add a new action check_pkt_len · 4d5ec89f
      Numan Siddique 提交于
      This patch adds a new action - 'check_pkt_len' which checks the
      packet length and executes a set of actions if the packet
      length is greater than the specified length or executes
      another set of actions if the packet length is lesser or equal to.
      
      This action takes below nlattrs
        * OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for
      
        * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER - Nested actions
          to apply if the packet length is greater than the specified 'pkt_len'
      
        * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL - Nested
          actions to apply if the packet length is lesser or equal to the
          specified 'pkt_len'.
      
      The main use case for adding this action is to solve the packet
      drops because of MTU mismatch in OVN virtual networking solution.
      When a VM (which belongs to a logical switch of OVN) sends a packet
      destined to go via the gateway router and if the nic which provides
      external connectivity, has a lesser MTU, OVS drops the packet
      if the packet length is greater than this MTU.
      
      With the help of this action, OVN will check the packet length
      and if it is greater than the MTU size, it will generate an
      ICMP packet (type 3, code 4) and includes the next hop mtu in it
      so that the sender can fragment the packets.
      
      Reported-at:
      https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.htmlSuggested-by: NBen Pfaff <blp@ovn.org>
      Signed-off-by: NNuman Siddique <nusiddiq@redhat.com>
      CC: Gregory Rose <gvrose8192@gmail.com>
      CC: Pravin B Shelar <pshelar@ovn.org>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Tested-by: NGreg Rose <gvrose8192@gmail.com>
      Reviewed-by: NGreg Rose <gvrose8192@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4d5ec89f
  18. 17 1月, 2019 1 次提交
  19. 09 11月, 2018 1 次提交
  20. 01 11月, 2018 1 次提交
  21. 08 7月, 2018 1 次提交
  22. 29 6月, 2018 1 次提交
  23. 05 5月, 2018 1 次提交
    • S
      openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found · 72f17baf
      Stefano Brivio 提交于
      If an OVS_ATTR_NESTED attribute type is found while walking
      through netlink attributes, we call nlattr_set() recursively
      passing the length table for the following nested attributes, if
      different from the current one.
      
      However, once we're done with those sub-nested attributes, we
      should continue walking through attributes using the current
      table, instead of using the one related to the sub-nested
      attributes.
      
      For example, given this sequence:
      
      1  OVS_KEY_ATTR_PRIORITY
      2  OVS_KEY_ATTR_TUNNEL
      3	OVS_TUNNEL_KEY_ATTR_ID
      4	OVS_TUNNEL_KEY_ATTR_IPV4_SRC
      5	OVS_TUNNEL_KEY_ATTR_IPV4_DST
      6	OVS_TUNNEL_KEY_ATTR_TTL
      7	OVS_TUNNEL_KEY_ATTR_TP_SRC
      8	OVS_TUNNEL_KEY_ATTR_TP_DST
      9  OVS_KEY_ATTR_IN_PORT
      10 OVS_KEY_ATTR_SKB_MARK
      11 OVS_KEY_ATTR_MPLS
      
      we switch to the 'ovs_tunnel_key_lens' table on attribute #3,
      and we don't switch back to 'ovs_key_lens' while setting
      attributes #9 to #11 in the sequence. As OVS_KEY_ATTR_MPLS
      evaluates to 21, and the array size of 'ovs_tunnel_key_lens' is
      15, we also get this kind of KASan splat while accessing the
      wrong table:
      
      [ 7654.586496] ==================================================================
      [ 7654.594573] BUG: KASAN: global-out-of-bounds in nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.603214] Read of size 4 at addr ffffffffc169ecf0 by task handler29/87430
      [ 7654.610983]
      [ 7654.612644] CPU: 21 PID: 87430 Comm: handler29 Kdump: loaded Not tainted 3.10.0-866.el7.test.x86_64 #1
      [ 7654.623030] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
      [ 7654.631379] Call Trace:
      [ 7654.634108]  [<ffffffffb65a7c50>] dump_stack+0x19/0x1b
      [ 7654.639843]  [<ffffffffb53ff373>] print_address_description+0x33/0x290
      [ 7654.647129]  [<ffffffffc169b37b>] ? nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.654607]  [<ffffffffb53ff812>] kasan_report.part.3+0x242/0x330
      [ 7654.661406]  [<ffffffffb53ff9b4>] __asan_report_load4_noabort+0x34/0x40
      [ 7654.668789]  [<ffffffffc169b37b>] nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.676076]  [<ffffffffc167ef68>] ovs_nla_get_match+0x10c8/0x1900 [openvswitch]
      [ 7654.684234]  [<ffffffffb61e9cc8>] ? genl_rcv+0x28/0x40
      [ 7654.689968]  [<ffffffffb61e7733>] ? netlink_unicast+0x3f3/0x590
      [ 7654.696574]  [<ffffffffc167dea0>] ? ovs_nla_put_tunnel_info+0xb0/0xb0 [openvswitch]
      [ 7654.705122]  [<ffffffffb4f41b50>] ? unwind_get_return_address+0xb0/0xb0
      [ 7654.712503]  [<ffffffffb65d9355>] ? system_call_fastpath+0x1c/0x21
      [ 7654.719401]  [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
      [ 7654.726298]  [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
      [ 7654.733195]  [<ffffffffb53fe4b5>] ? kasan_unpoison_shadow+0x35/0x50
      [ 7654.740187]  [<ffffffffb53fe62a>] ? kasan_kmalloc+0xaa/0xe0
      [ 7654.746406]  [<ffffffffb53fec32>] ? kasan_slab_alloc+0x12/0x20
      [ 7654.752914]  [<ffffffffb53fe711>] ? memset+0x31/0x40
      [ 7654.758456]  [<ffffffffc165bf92>] ovs_flow_cmd_new+0x2b2/0xf00 [openvswitch]
      
      [snip]
      
      [ 7655.132484] The buggy address belongs to the variable:
      [ 7655.138226]  ovs_tunnel_key_lens+0xf0/0xffffffffffffd400 [openvswitch]
      [ 7655.145507]
      [ 7655.147166] Memory state around the buggy address:
      [ 7655.152514]  ffffffffc169eb80: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa
      [ 7655.160585]  ffffffffc169ec00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 7655.168644] >ffffffffc169ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa
      [ 7655.176701]                                                              ^
      [ 7655.184372]  ffffffffc169ed00: fa fa fa fa 00 00 00 00 fa fa fa fa 00 00 00 05
      [ 7655.192431]  ffffffffc169ed80: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
      [ 7655.200490] ==================================================================
      Reported-by: NHangbin Liu <liuhangbin@gmail.com>
      Fixes: 982b5270 ("openvswitch: Fix mask generation for nested attributes.")
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72f17baf
  24. 26 1月, 2018 1 次提交
  25. 19 1月, 2018 1 次提交
  26. 16 1月, 2018 1 次提交
  27. 16 12月, 2017 1 次提交
  28. 27 11月, 2017 1 次提交
    • Z
      openvswitch: fix the incorrect flow action alloc size · 67c8d22a
      zhangliping 提交于
      If we want to add a datapath flow, which has more than 500 vxlan outputs'
      action, we will get the following error reports:
        openvswitch: netlink: Flow action size 32832 bytes exceeds max
        openvswitch: netlink: Flow action size 32832 bytes exceeds max
        openvswitch: netlink: Actions may not be safe on all matching packets
        ... ...
      
      It seems that we can simply enlarge the MAX_ACTIONS_BUFSIZE to fix it, but
      this is not the root cause. For example, for a vxlan output action, we need
      about 60 bytes for the nlattr, but after it is converted to the flow
      action, it only occupies 24 bytes. This means that we can still support
      more than 1000 vxlan output actions for a single datapath flow under the
      the current 32k max limitation.
      
      So even if the nla_len(attr) is larger than MAX_ACTIONS_BUFSIZE, we
      shouldn't report EINVAL and keep it move on, as the judgement can be
      done by the reserve_sfa_size.
      Signed-off-by: Nzhangliping <zhangliping02@baidu.com>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      67c8d22a
  29. 14 11月, 2017 1 次提交
  30. 13 11月, 2017 1 次提交
  31. 08 11月, 2017 1 次提交
    • Y
      openvswitch: enable NSH support · b2d0f5d5
      Yi Yang 提交于
      v16->17
       - Fixed disputed check code: keep them in nsh_push and nsh_pop
         but also add them in __ovs_nla_copy_actions
      
      v15->v16
       - Add csum recalculation for nsh_push, nsh_pop and set_nsh
         pointed out by Pravin
       - Move nsh key into the union with ipv4 and ipv6 and add
         check for nsh key in match_validate pointed out by Pravin
       - Add nsh check in validate_set and __ovs_nla_copy_actions
      
      v14->v15
       - Check size in nsh_hdr_from_nlattr
       - Fixed four small issues pointed out By Jiri and Eric
      
      v13->v14
       - Rename skb_push_nsh to nsh_push per Dave's comment
       - Rename skb_pop_nsh to nsh_pop per Dave's comment
      
      v12->v13
       - Fix NSH header length check in set_nsh
      
      v11->v12
       - Fix missing changes old comments pointed out
       - Fix new comments for v11
      
      v10->v11
       - Fix the left three disputable comments for v9
         but not fixed in v10.
      
      v9->v10
       - Change struct ovs_key_nsh to
             struct ovs_nsh_key_base base;
             __be32 context[NSH_MD1_CONTEXT_SIZE];
       - Fix new comments for v9
      
      v8->v9
       - Fix build error reported by daily intel build
         because nsh module isn't selected by openvswitch
      
      v7->v8
       - Rework nested value and mask for OVS_KEY_ATTR_NSH
       - Change pop_nsh to adapt to nsh kernel module
       - Fix many issues per comments from Jiri Benc
      
      v6->v7
       - Remove NSH GSO patches in v6 because Jiri Benc
         reworked it as another patch series and they have
         been merged.
       - Change it to adapt to nsh kernel module added by NSH
         GSO patch series
      
      v5->v6
       - Fix the rest comments for v4.
       - Add NSH GSO support for VxLAN-gpe + NSH and
         Eth + NSH.
      
      v4->v5
       - Fix many comments by Jiri Benc and Eric Garver
         for v4.
      
      v3->v4
       - Add new NSH match field ttl
       - Update NSH header to the latest format
         which will be final format and won't change
         per its author's confirmation.
       - Fix comments for v3.
      
      v2->v3
       - Change OVS_KEY_ATTR_NSH to nested key to handle
         length-fixed attributes and length-variable
         attriubte more flexibly.
       - Remove struct ovs_action_push_nsh completely
       - Add code to handle nested attribute for SET_MASKED
       - Change PUSH_NSH to use the nested OVS_KEY_ATTR_NSH
         to transfer NSH header data.
       - Fix comments and coding style issues by Jiri and Eric
      
      v1->v2
       - Change encap_nsh and decap_nsh to push_nsh and pop_nsh
       - Dynamically allocate struct ovs_action_push_nsh for
         length-variable metadata.
      
      OVS master and 2.8 branch has merged NSH userspace
      patch series, this patch is to enable NSH support
      in kernel data path in order that OVS can support
      NSH in compat mode by porting this.
      Signed-off-by: NYi Yang <yi.y.yang@intel.com>
      Acked-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NEric Garver <e@erig.me>
      Acked-by: NPravin Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b2d0f5d5
  32. 11 10月, 2017 1 次提交
  33. 10 10月, 2017 1 次提交
  34. 12 8月, 2017 1 次提交
  35. 25 6月, 2017 1 次提交
    • J
      net: store port/representator id in metadata_dst · 3fcece12
      Jakub Kicinski 提交于
      Switches and modern SR-IOV enabled NICs may multiplex traffic from Port
      representators and control messages over single set of hardware queues.
      Control messages and muxed traffic may need ordered delivery.
      
      Those requirements make it hard to comfortably use TC infrastructure today
      unless we have a way of attaching metadata to skbs at the upper device.
      Because single set of queues is used for many netdevs stopping TC/sched
      queues of all of them reliably is impossible and lower device has to
      retreat to returning NETDEV_TX_BUSY and usually has to take extra locks on
      the fastpath.
      
      This patch attempts to enable port/representative devs to attach metadata
      to skbs which carry port id.  This way representatives can be queueless and
      all queuing can be performed at the lower netdev in the usual way.
      
      Traffic arriving on the port/representative interfaces will be have
      metadata attached and will subsequently be queued to the lower device for
      transmission.  The lower device should recognize the metadata and translate
      it to HW specific format which is most likely either a special header
      inserted before the network headers or descriptor/metadata fields.
      
      Metadata is associated with the lower device by storing the netdev pointer
      along with port id so that if TC decides to redirect or mirror the new
      netdev will not try to interpret it.
      
      This is mostly for SR-IOV devices since switches don't have lower netdevs
      today.
      Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: NSridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: NSimon Horman <horms@verge.net.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3fcece12
  36. 14 4月, 2017 1 次提交
  37. 23 3月, 2017 1 次提交
    • A
      openvswitch: Optimize sample action for the clone use cases · 798c1661
      andy zhou 提交于
      With the introduction of open flow 'clone' action, the OVS user space
      can now translate the 'clone' action into kernel datapath 'sample'
      action, with 100% probability, to ensure that the clone semantics,
      which is that the packet seen by the clone action is the same as the
      packet seen by the action after clone, is faithfully carried out
      in the datapath.
      
      While the sample action in the datpath has the matching semantics,
      its implementation is only optimized for its original use.
      Specifically, there are two limitation: First, there is a 3 level of
      nesting restriction, enforced at the flow downloading time. This
      limit turns out to be too restrictive for the 'clone' use case.
      Second, the implementation avoid recursive call only if the sample
      action list has a single userspace action.
      
      The main optimization implemented in this series removes the static
      nesting limit check, instead, implement the run time recursion limit
      check, and recursion avoidance similar to that of the 'recirc' action.
      This optimization solve both #1 and #2 issues above.
      
      One related optimization attempts to avoid copying flow key as
      long as the actions enclosed does not change the flow key. The
      detection is performed only once at the flow downloading time.
      
      Another related optimization is to rewrite the action list
      at flow downloading time in order to save the fast path from parsing
      the sample action list in its original form repeatedly.
      Signed-off-by: NAndy Zhou <azhou@ovn.org>
      Acked-by: NPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      798c1661
  38. 17 3月, 2017 1 次提交