1. 12 Jan, 2016 16 commits
    • D
      Merge branch 'mlx5-enhanced-flow-steering' · 7937963a
      David S. Miller committed
      Or Gerlitz says:
      
      ====================
      net/mlx5_core: Enhance flow steering support
      
      v0 --> v1 changes:
        - fixed improperly formatted comments.
        - compare value of ib_spec->eth.mask.ether_type in network byte order
           in ('IB/mlx5: Add flow steering utilities').
      
      v1 --> v2 changes:
        - made sure that service functions added in the IB driver are only static-fied
          on the last commit, to make sure bisection with -Werror works fine.
      
      v2 --> v3 changes:
         - squashed patches 11 and 12 into one patch, s.t Dave's comment
           on unused static functions gcc complaints during bisection is
           correctly addressed.
      
      v3 has been generated against net-next commit c9c99311 "Merge tag
      'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge"
      
      The series is signed by Matan, who was recently assigned as a maintainer of
      the mlx5_core and IB drivers (this is a 4.5-rc1 change to the maintainers file coming
      from the rdma tree) -- as such I didn't see a need to add my signature (Or).
      
      This series adds three new functionalities to the driver flow-steering
      infrastructure: auto-grouped flow tables, chaining of flow tables and
      updates for the root flow table.
      
      1. Auto-grouped flow tables - Flow table with auto grouping management.
      When a flow table is created, hints regarding the number of rule types
      and the number of rules are given in advance. Thus, a flow table is
      divided into #NUM_TYPES+1 groups, each of which contains
      (#NUM_RULES)/(#NUM_TYPES+1) rules. The first #NUM_TYPES parts are groups
      which are filled if the added rule matches the group specification or
      the group is empty. The last part is filled by rules that can't fit
      any of the former groups.
      
      2. Chaining flow tables - Flow tables from different priorities are chained
      together: if there is no match in the flow table of priority i, we continue
      searching for a match in priority i+1. This is true whether or not
      priorities i and i+1 belong to the same namespace.
      
      3. Updating the root flow table - the root flow table is the flow table
      with the lowest level. The hardware starts searching for a match in the
      root flow table and continues according to the matches it finds along
      the way.
      
      The first usage for the new functionality is flow steering for user-space
      ConnectX-4 offloaded HW Eth RX queues done through the mlx5 IB driver.
      
      When the mlx5 core driver is loaded, it opens three flow namespaces:
      1. By-pass namespace (used by mlx5 IB driver).
      2. Kernel namespace (used in order to get packets to the networking stack
      through mlx5 EN driver).
      3. Leftovers namespace (used by mlx5 IB and future sniffer)
      
      The series is built as follows:
      
      Patch #1 introduces auto-grouped flow tables support.
      
      Patch #2 adds utility functions for finding the next and the previous
      flow tables in different priorities. This is used in order to chain
      the flow tables in a downstream patch.
      
      Patch #3 introduces a firmware command for updating the root flow table.
      
      Patch #4 introduces the modify flow table firmware command. This command is used
      when we want to change the next flow table of an existing flow table.
      This is used for chaining flow tables as well.
      
      Patch #5 connects/disconnects flow tables. This is actually the chaining
      process when we want to link flow tables. This means that if we couldn't
      find a match in the first flow table, we'll continue in the chained
      flow table.
      
      Patch #6 updates the priority attributes that are required for flow table
      level allocation. We update both the max_fts (the number of allowed FTs
      in the sub-tree of this priority) and the start_level (which is the first
      level we'll assign to the flow-tables created inside the priority).
      
      Patch #7 adds checking of required device capabilities. Some namespaces
      can only be created if the hardware supports certain attributes.
      This is especially true for the Bypass and leftovers namespaces. This
      adds a generic mechanism to check these required attributes.
      
      Patch #8 creates two additional namespaces:
      	a. Bypass flow rules (has nine priorities)
      	b. Leftover packets (has one priority) - for unmatched packets.
      
      Patch #9 re-factors ipv4/ipv6 match fields in the mlx5 firmware interface
      header to be more clear.
      
      Patch #10 exports the flow steering API for mlx5_ib usage
      
      Patch #11 implements the required support in mlx5_ib in order
      to support the RDMA flow steering verbs.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7937963a
    • M
      IB/mlx5: Add flow steering support · 038d2ef8
      Maor Gottlieb committed
      Add flow steering support by creating a flow table per
      priority (if rules exist in the priority). mlx5_ib uses
      autogrouping and thus only creates the required destinations.
      
      This also adds the following flow steering utilities:
      
      1. Parsing verbs flow attributes into hardware steering specs.
      
      2. Check if a flow is multicast - this is required in order to decide
      which flow table we add the steering rule to.
      
      3. Set outer headers in flow match criteria to zeros.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      038d2ef8
    • M
      net/mlx5_core: Export flow steering API · b217ea25
      Maor Gottlieb committed
      Add exports to flow steering API for mlx5_ib usage.
      The following functions are exported:
      
      1. mlx5_create_auto_grouped_flow_table - used to create a flow
      table with auto flow grouping management (create and destroy
      flow groups). In auto-grouped flow tables, we create groups
      automatically if needed (if we don't find an existing
      flow group with the same match criteria when we add a new rule).
      
      2. mlx5_destroy_flow_table - used to destroy a flow table.
      
      3. mlx5_add_flow_rule - used to add a flow rule to a flow table.
      
      4. mlx5_del_flow_rule - used to delete a flow rule from its flow table.
      
      5. mlx5_get_flow_namespace - used to get a handle to the required
      namespace sub-tree.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b217ea25
    • M
      net/mlx5_core: Make ipv4/ipv6 location more clear · b4d1f032
      Maor Gottlieb committed
      Change the mlx5 firmware interface header to make it
      more clear which bytes should be used by IPv4 or
      IPv6 addresses.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b4d1f032
    • M
      net/mlx5_core: Enable flow steering support for the IB driver · 4cbdd30e
      Maor Gottlieb committed
      When the driver is loaded, we create a flow steering namespace
      for kernel bypass with nine priorities and another namespace
      for leftovers (in order to catch packets that weren't matched).
      Verbs applications will use these priorities.
      We found nine to be a number that balances the requirements from the
      user and retains performance.
      
      The bypass namespace is used by verbs applications that want to bypass
      the kernel networking stack. The leftovers namespace is used by verbs
      applications and the sniffer in order to catch packets that weren't
      handled by any preceding rules.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4cbdd30e
    • M
      net/mlx5_core: Initialize namespaces only when supported by device · 8d40d162
      Maor Gottlieb committed
      Before we create the subtree of a steering namespace (kernel, bypass,
      leftovers), we check that the device has the capabilities required
      to create this subtree.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d40d162
    • M
      net/mlx5_core: Set priority attributes · 655227ed
      Maor Gottlieb committed
      Each priority has two attributes:
      1. max_ft - maximum allowed flow tables under this priority.
      2. start_level - start level range of the flow tables
      in the priority.
      
      These attributes are set by traversing the tree nodes with a
      DFS that assigns the start level and max flow tables of each priority.
      The start level depends on the max flow tables of the prior priorities
      in the tree.
      
      The leaves of the tree have max_ft set in them. Each node accumulates
      the max_ft of its children and sets its own max_ft accordingly.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      655227ed
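The DFS described in this commit can be sketched with a toy Python model. This is not the driver's code: the `Node` class, its fields and `set_attributes` are names assumed for illustration, and the real tree alternates namespaces and priorities, which this sketch flattens into a single node type.

```python
class Node:
    """Toy tree node (hypothetical); leaves carry a preset max_ft."""

    def __init__(self, name, max_ft=0, children=None):
        self.name = name
        self.max_ft = max_ft          # preset on leaves, accumulated on inner nodes
        self.start_level = 0
        self.children = children or []


def set_attributes(node, start_level=0):
    """Depth-first walk that returns the node's accumulated max_ft."""
    node.start_level = start_level
    if not node.children:
        return node.max_ft
    total = 0
    for child in node.children:
        # A child starts after every level reserved by its earlier siblings.
        total += set_attributes(child, start_level + total)
    node.max_ft = total
    return total


root = Node("root", children=[
    Node("bypass", children=[Node("p0", 1), Node("p1", 1)]),
    Node("kernel", 3),
    Node("leftovers", 1),
])
set_attributes(root)
```

In this example the kernel node starts at level 2 because the two bypass leaves reserve levels 0 and 1 ahead of it.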
    • M
      net/mlx5_core: Connect flow tables · f90edfd2
      Maor Gottlieb committed
      Flow tables from different priorities should be chained together.
      When a packet arrives we search for a match in the
      by-pass flow tables (first we search for a match in priority 0
      and if we don't find a match we move to the next priority).
      If we can't find a match in any of the bypass flow-tables, we continue
      searching in the flow-tables of the next priority, which are the
      kernel's flow tables.
      
      Setting the miss flow table of a new flow table to be the next one in
      the list is performed via the create flow table API. If we want to change an
      existing flow table, for example in order to point from an
      existing flow table to the new next-in-list flow table, we use the
      modify flow table API.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f90edfd2
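A minimal Python sketch of the chaining idea in this commit, under stated assumptions: `FlowTable`, `connect` and `lookup` are hypothetical names, rules are modelled as (predicate, action) pairs, and the firmware create/modify commands are reduced to plain pointer assignments.

```python
class FlowTable:
    """Toy flow table (hypothetical model, not the driver's structure)."""

    def __init__(self, name, priority):
        self.name = name
        self.priority = priority
        self.rules = []        # list of (predicate, action) pairs
        self.next_ft = None    # table searched when nothing here matches


def connect(chain, new_ft):
    """Insert new_ft into a priority-sorted chain and fix the miss pointers."""
    idx = 0
    while idx < len(chain) and chain[idx].priority <= new_ft.priority:
        idx += 1
    # "Create" path: the new table is born pointing at the next one in the list.
    new_ft.next_ft = chain[idx] if idx < len(chain) else None
    # "Modify" path: the existing previous table is re-pointed at the new one.
    if idx > 0:
        chain[idx - 1].next_ft = new_ft
    chain.insert(idx, new_ft)


def lookup(root, packet):
    """Walk the chain from the root until some rule matches."""
    ft = root
    while ft is not None:
        for predicate, action in ft.rules:
            if predicate(packet):
                return action
        ft = ft.next_ft   # miss: continue in the chained table
    return None


chain = []
kernel = FlowTable("kernel", 1)
bypass = FlowTable("bypass", 0)
connect(chain, kernel)
connect(chain, bypass)
kernel.rules.append((lambda p: True, "to_stack"))
bypass.rules.append((lambda p: p == "udp", "to_qp"))
```

Here a "udp" packet is caught by the bypass table, while anything else misses there and falls through to the kernel table.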
    • M
      net/mlx5_core: Introduce modify flow table command · 34a40e68
      Maor Gottlieb committed
      Introduce the modify flow table command. This command is used when
      we want to change the next flow table of an existing flow table.
      The next flow table is defined as the table we search (in order
      to find a match), if we couldn't find a match in any of the flow table
      entries in the current flow table.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      34a40e68
    • M
      net/mlx5_core: Managing root flow table · 2cc43b49
      Maor Gottlieb committed
      The root Flow Table for each Flow Table Type is defined,
      by default, as the Flow Table with level 0.
      
      In order to avoid using empty flow tables and introducing new hops,
      while still preserving space for flow tables that have a priority
      greater (lower number) than the current flow table, we introduce this
      new set root flow table command.
      This command tells the HW to start matching packets from the
      assigned root flow table.
      This command is used when we create a new flow table whose level is lower
      than that of the current lowest flow table, or when it is the first flow table.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2cc43b49
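The condition for issuing the set root command can be illustrated with a small Python model. It is only a sketch: `Device`, `create_flow_table` and the `"SET_ROOT"` tuple are hypothetical stand-ins for the driver state and the firmware command.

```python
class Device:
    """Toy device (hypothetical) that records the commands it would issue."""

    def __init__(self):
        self.root_level = None   # level of the current root table, if any
        self.commands = []       # stand-in for firmware commands

    def create_flow_table(self, level):
        # The root is updated for the first table, or for a table whose
        # level is lower than that of the current root.
        if self.root_level is None or level < self.root_level:
            self.commands.append(("SET_ROOT", level))
            self.root_level = level


dev = Device()
for level in (5, 7, 2):
    dev.create_flow_table(level)
```

Creating the table at level 7 issues no command, since the root at level 5 already precedes it.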
    • M
      net/mlx5_core: Add utilities to find next and prev flow-tables · fdb6896f
      Maor Gottlieb committed
      Add two utility functions for finding the next and prev flow tables.
      The find next flow table function gets a priority and returns the
      first flow table of the next priority in the tree.
      The find prev flow table function returns the last flow table of
      the previous priority in the tree.
      
      These utility functions are used for chaining flow tables from different
      priorities.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fdb6896f
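A flattened Python sketch of the two lookups, assuming the priorities have already been laid out in tree order as a list of lists of flow tables. The function names and the choice to skip priorities that hold no tables are assumptions of this sketch, not statements about the driver.

```python
def find_next_ft(priorities, idx):
    """Return the first table of the nearest later priority that has one."""
    for tables in priorities[idx + 1:]:
        if tables:
            return tables[0]
    return None


def find_prev_ft(priorities, idx):
    """Return the last table of the nearest earlier priority that has one."""
    for tables in reversed(priorities[:idx]):
        if tables:
            return tables[-1]
    return None


# Three priorities; the middle one currently holds no flow tables.
prios = [["a0", "a1"], [], ["c0"]]
```

With this layout, `find_next_ft(prios, 0)` yields `"c0"` and `find_prev_ft(prios, 2)` yields `"a1"`, the two tables that would be linked when chaining.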
    • M
      net/mlx5_core: Introduce flow steering autogrouped flow table · f0d22d18
      Maor Gottlieb committed
      When a user adds a rule to an autogrouped flow table, we search
      for a flow group with the same match criteria. If we don't
      find such a group, we create a new flow group with the
      required match criteria and insert the rule into this group.
      
      We divide the flow table into required_groups + 1 parts,
      in order to reserve a part of the flow table for rules
      which don't match any existing group.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f0d22d18
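The autogrouping behaviour described in this commit can be modelled roughly as follows. This Python sketch is an illustration only: `AutoGroupedTable`, `max_entries` and the string return values are hypothetical, and sending a rule to the reserved part when its group is full is an assumption of the sketch.

```python
class AutoGroupedTable:
    """Toy model: the table is split into required_groups + 1 equal parts."""

    def __init__(self, max_entries, required_groups):
        self.required_groups = required_groups
        # Every part, including the reserved last one, gets the same size.
        self.group_size = max_entries // (required_groups + 1)
        self.groups = {}       # match criteria -> list of rules
        self.leftovers = []    # reserved part for rules that fit no group

    def add_rule(self, criteria, rule):
        group = self.groups.get(criteria)
        if group is None and len(self.groups) < self.required_groups:
            # No group with these match criteria yet: create one.
            group = self.groups[criteria] = []
        if group is not None and len(group) < self.group_size:
            group.append(rule)
            return "group"
        # Assumption: anything that cannot be placed lands in the reserved part.
        if len(self.leftovers) < self.group_size:
            self.leftovers.append(rule)
            return "leftovers"
        raise RuntimeError("flow table is full")


table = AutoGroupedTable(max_entries=9, required_groups=2)
placements = [table.add_rule(c, i) for i, c in enumerate(["ipv4", "ipv6", "arp"])]
```

With nine entries and two required groups, each part holds three rules; the third distinct match criteria finds no free group and is placed in the reserved part.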
    • D
      Merge branch 'bpf-next' · 23c09c26
      David S. Miller committed
      Daniel Borkmann says:
      
      ====================
      BPF update
      
      This set adds IPv6 support for bpf_skb_{set,get}_tunnel_key() helper.
      It also exports flags to user space that are being used in helpers and
      weren't exported thus far. For more details, please see the individual
      patches.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      23c09c26
    • D
      bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key · c6c33454
      Daniel Borkmann committed
      After IPv6 support has recently been added to metadata dst and related
      encaps, add support for populating/reading it from an eBPF program.
      
      Commit d3aa45ce ("bpf: add helpers to access tunnel metadata") started
      with initial IPv4-only support back then (due to IPv6 metadata support
      not being available yet).
      
      To stay compatible with older programs, we need to test for the passed
      structure size. Also TOS and TTL support from the ip_tunnel_info key has
      been added. Tested with vxlan devs in collect meta data mode with IPv4,
      IPv6 and in compat mode over different network namespaces.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c6c33454
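The compatibility check on the passed structure size might look like the following Python sketch. The sizes and the names `LEGACY_SIZE`, `FULL_SIZE` and `check_tunnel_key_size` are hypothetical placeholders, not the real layout of the tunnel key structure.

```python
import errno

# Hypothetical sizes: an older IPv4-only layout and the extended layout.
LEGACY_SIZE = 8
FULL_SIZE = 24


def check_tunnel_key_size(size):
    """Decide how to treat a caller-supplied structure size."""
    if size == FULL_SIZE:
        return "full"      # new programs: IPv6, TOS and TTL fields available
    if size == LEGACY_SIZE:
        return "compat"    # older programs: IPv4-only view of the key
    return -errno.EINVAL   # any other size is rejected
```

Accepting exactly the known sizes lets older programs keep working while still rejecting anything the helper does not understand.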
    • D
      bpf: export helper function flags and reject invalid ones · 781c53bc
      Daniel Borkmann committed
      Export flags used by eBPF helper functions through UAPI, so they can be
      used by programs (instead of them redefining all flags each time or just
      using the hard-coded values). It also gives a better overview what flags
      are used where and we can further get rid of the extra macros defined in
      filter.c. Moreover, reject invalid flags.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      781c53bc
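Rejecting invalid flags usually comes down to a mask test, sketched here in Python. The flag names and bit positions are hypothetical; the real definitions are the ones exported through the UAPI header.

```python
import errno

# Hypothetical flag bits accepted by one helper.
FLAG_RECOMPUTE = 1 << 0
FLAG_INVALIDATE = 1 << 1
ALLOWED_FLAGS = FLAG_RECOMPUTE | FLAG_INVALIDATE


def helper(flags):
    """Toy helper that validates its flags before doing any work."""
    # Any bit outside the allowed mask is rejected up front.
    if flags & ~ALLOWED_FLAGS:
        return -errno.EINVAL
    return 0
```

Checking against an explicit mask keeps unused bits free for later extensions, because a program that sets them today gets an error rather than silently different behaviour.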
    • J
      bonding: make mii_status sysfs node consistent · c8086f6d
      Jarod Wilson committed
      The spew in /proc/net/bonding/bond0 uses netif_carrier_ok() to determine
      mii_status, while /sys/class/net/bond0/bonding/mii_status looks at
      curr_active_slave, which doesn't actually seem to be set sometimes when
      the bond actually is up. A mode 4 bond configured via ifcfg-foo files on a
      Red Hat Enterprise Linux system, after boot, comes up clean and
      functional, but the sysfs node shows mii_status of down, while proc shows
      up. A simple enough fix here seems to be to use the same method for
      determining up or down in both places, and I'd opt for the one that seems
      to match reality.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <gospo@cumulusnetworks.com>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c8086f6d
  2. 11 Jan, 2016 24 commits