1. 02 May, 2016 13 commits
  2. 30 Apr, 2016 27 commits
    • D
      Merge branch 'mlx5-aRFS' · 3cfef195
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      Mellanox 100G mlx5 ethernet aRFS support
      
      This series adds accelerated RFS support for the mlx5e driver.
      I have added one patch unrelated to aRFS that fixes the rtnl_lock
      warning the mlx5 driver has been getting since b7aade15 ('vxlan: break dependency with netdev drivers').
      
      aRFS support in detail:
      
      A direct TIR per RQ is now required in order to have the essential building blocks
      for aRFS.  Today the driver has one direct TIR that forwards traffic to RQ[0] (core 0),
      and one indirect TIR for RSS indirection table.  For that we've added one direct TIR
      per RQ, e.g.: TIR[i] -> RQ[i] (core i).
      
      Expose the modify-flow-rule-destination operation in the flow steering API, so that
      the ethernet driver can dynamically change the destination TIR (core) of aRFS rules.
      
      Initialize the CPU reverse mapping to notify the upper layer of the internal receive
      queue CPU mappings.
      
      Some design refactoring of the mlx5e ethernet driver flow tables and flow steering API.
      The caller of create_flow_table can now choose the level of the flow table. This way
      we create the mlx5e flow tables in reverse order and connect them as we go:
      flow table[i + 1] is created before flow table[i], so that flow table[i + 1] can be set
      as a destination of flow table[i] once flow table[i] is created.
      We have also split the main flow table in the following manner:
          - Before: an RX packet had to visit two flow tables before being delivered to its receive queue:
              RX packet -> vlan filter flow table -> main flow table.
              > vlan filter checks that the packet's vlan field is allowed.
              > main flow table checks that the dest mac is allowed and checks the l3/l4 headers to
              retrieve the RSS hash for steering the packet into its final receive queue.
      
          - Now: the main flow table is split into an l2 dst mac steering table and a ttc (traffic type classifier) table:
              RX packet -> vlan filter -> l2 table -> ttc table
              > vlan filter - same as before
              > L2 filter - filters packets according to their destination mac address
              > ttc table - classifies packet headers for RSS steering
                  - L3/L4 classification rules steer the packet according to its header hash
                  - if none of the rules applies, the packet is steered to RQ[0]
      
      After the above refactoring, all that is left to do is to create the aRFS flow table, which
      manages aRFS steering rules that forward traffic to the desired RQ (core), and to connect the
      ttc table rule destinations to the aRFS flow table.
      
      In case of a miss, the aRFS flow table delivers the traffic to the core that the original
      ttc hash would have chosen.
      
      The aRFS table is not initialized and enabled until the user explicitly asks for it, i.e. by setting
      NETIF_F_NTUPLE to ON.  This way the ttc table does not need to forward traffic to the aRFS table unless required.
      When NETIF_F_NTUPLE is set back to OFF, the aRFS flow table is disabled and disconnected.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
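The receive path described in the merge message above can be sketched as a chain of three lookups. This is a minimal illustration, not driver code: the struct layouts, the `rx_steer` name, the fixed `NUM_RQS`, and the choice to drop packets that fail the vlan or l2 stage are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_RQS 4 /* assumed number of receive queues */

/* Hypothetical packet summary; field names are illustrative only. */
struct rx_pkt {
    uint16_t vlan_id;
    uint8_t  dmac[6];
    bool     has_l3l4;  /* true if an L3/L4 ttc rule would match */
    uint32_t rss_hash;  /* hash computed over the matched headers */
};

/* Hypothetical filter configuration: one allowed vlan, one allowed MAC. */
struct rx_cfg {
    uint16_t allowed_vlan;
    uint8_t  allowed_dmac[6];
};

/* Returns the RQ index the packet lands on, or -1 if it is filtered out. */
int rx_steer(const struct rx_cfg *cfg, const struct rx_pkt *pkt)
{
    /* Stage 1: vlan filter table. */
    if (pkt->vlan_id != cfg->allowed_vlan)
        return -1;

    /* Stage 2: l2 table, match on destination MAC only. */
    if (memcmp(pkt->dmac, cfg->allowed_dmac, sizeof(pkt->dmac)) != 0)
        return -1;

    /* Stage 3: ttc table. L3/L4 rules spread traffic by hash;
     * if no rule applies, the packet goes to RQ[0]. */
    if (pkt->has_l3l4)
        return (int)(pkt->rss_hash % NUM_RQS);
    return 0;
}
```

Keeping the MAC check in its own stage is what lets the later stages be shared by every configured address.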
    • M
      net/mlx5e: Enabling aRFS mechanism · 45bf454a
      Maor Gottlieb committed
      Accelerated RFS requires that ntuple filtering is enabled via
      ethtool and that the driver supports ndo_rx_flow_steer.
      When ntuple filtering is enabled, we modify the l3_l4 ttc
      rules to point to the aRFS flow tables, and when the filtering
      is disabled, we modify the l3_l4 ttc rules to point to the RSS
      TIRs.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
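The toggle described in this commit can be pictured as rewriting the destination of each l3_l4 ttc rule. The sketch below is an assumption-laden model: `ttc_table`, `ttc_set_ntuple`, and the count of four l3_l4 rules (IPv4/IPv6 x TCP/UDP) are hypothetical names and values chosen for the example, not identifiers from the driver.

```c
#include <stdbool.h>

/* Where an l3_l4 ttc rule forwards matching packets. */
enum ttc_dest { DEST_RSS_TIR, DEST_ARFS_TABLE };

#define NUM_L3L4_RULES 4 /* assumed: IPv4/IPv6 x TCP/UDP */

/* Hypothetical ttc table holding only the destinations of its l3_l4 rules. */
struct ttc_table {
    enum ttc_dest l3l4_dest[NUM_L3L4_RULES];
};

/* Re-point every l3_l4 rule according to the ntuple setting. */
void ttc_set_ntuple(struct ttc_table *ttc, bool ntuple_on)
{
    for (int i = 0; i < NUM_L3L4_RULES; i++)
        ttc->l3l4_dest[i] = ntuple_on ? DEST_ARFS_TABLE : DEST_RSS_TIR;
}
```

Only the destinations change; the rules themselves stay in place, which is why a modify-destination operation is sufficient here.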
    • M
      net/mlx5e: Add accelerated RFS support · 18c908e4
      Maor Gottlieb committed
      Implement the ndo_rx_flow_steer ndo.
      A new flow steering rule is composed from the
      skb 4-tuple and added to the hardware aRFS flow table.
      
      Each rule is stored in an internal hash table. If a rule for
      the skb 4-tuple already exists, we update the corresponding
      hardware steering rule with the new destination.
      
      For garbage collection, rps_may_expire_flow is
      invoked for a limited number of old rules upon every
      ndo_rx_flow_steer invocation.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
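The insert-or-update behaviour described in this commit can be sketched with a small chained hash table keyed on the 4-tuple. Everything here is hypothetical (the `tuple4` layout, the hash function, the bucket count, and the function names); it models only the bookkeeping, not the hardware rule programming, and it omits the garbage collection step.

```c
#include <stdint.h>
#include <stdlib.h>

#define ARFS_BUCKETS 64 /* assumed table size */

/* Hypothetical IPv4 4-tuple key. */
struct tuple4 {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

struct arfs_rule {
    struct tuple4 key;
    int rxq;                /* destination receive queue */
    struct arfs_rule *next; /* bucket chain */
};

struct arfs_table {
    struct arfs_rule *bucket[ARFS_BUCKETS];
};

static unsigned int tuple_hash(const struct tuple4 *t)
{
    uint32_t ports = ((uint32_t)t->src_port << 16) | t->dst_port;
    uint32_t h = t->src_ip ^ t->dst_ip ^ ports;

    return (h ^ (h >> 16)) % ARFS_BUCKETS;
}

static int tuple_eq(const struct tuple4 *a, const struct tuple4 *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Returns 1 if a new rule was added, 0 if an existing rule was
 * re-targeted, and -1 on allocation failure. */
int arfs_steer(struct arfs_table *tbl, const struct tuple4 *key, int rxq)
{
    unsigned int b = tuple_hash(key);
    struct arfs_rule *r;

    for (r = tbl->bucket[b]; r; r = r->next) {
        if (tuple_eq(&r->key, key)) {
            r->rxq = rxq; /* existing rule: only the destination changes */
            return 0;
        }
    }

    r = malloc(sizeof(*r));
    if (!r)
        return -1;
    r->key = *key;
    r->rxq = rxq;
    r->next = tbl->bucket[b];
    tbl->bucket[b] = r;
    return 1;
}

/* Returns the queue for the flow, or -1 if no rule exists. */
int arfs_lookup(const struct arfs_table *tbl, const struct tuple4 *key)
{
    const struct arfs_rule *r;

    for (r = tbl->bucket[tuple_hash(key)]; r; r = r->next)
        if (tuple_eq(&r->key, key))
            return r->rxq;
    return -1;
}
```

A caller would zero-initialize `struct arfs_table` and call `arfs_steer` once per steering request; a second call with the same key re-targets the rule instead of adding a duplicate.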
    • M
      net/mlx5e: Create aRFS flow tables · 1cabe6b0
      Maor Gottlieb committed
      Create the following four flow tables for aRFS usage:
      1. IPv4 TCP - filtering 4-tuple of IPv4 TCP packets.
      2. IPv6 TCP - filtering 4-tuple of IPv6 TCP packets.
      3. IPv4 UDP - filtering 4-tuple of IPv4 UDP packets.
      4. IPv6 UDP - filtering 4-tuple of IPv6 UDP packets.
      
      Each flow table has two flow groups: one for the 4-tuple
      filtering (full match) and the other containing a wildcard (*) rule for misses.
      
      A full match means a hit for aRFS, and the packet is forwarded
      to the dedicated RQ/core; packets hitting the miss rule are forwarded to
      default RSS hashing.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
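As a rough model of the four tables and their two flow groups, the sketch below picks a table from the packet's L3/L4 protocols and then applies the hit-or-miss decision. The enum and function names are invented for the example, and representing a miss as `-1` is an assumption.

```c
enum l3_proto { L3_IPV4, L3_IPV6 };
enum l4_proto { L4_TCP, L4_UDP, L4_OTHER };

/* Table order follows the list in the commit message. */
enum arfs_table_id {
    ARFS_IPV4_TCP,
    ARFS_IPV6_TCP,
    ARFS_IPV4_UDP,
    ARFS_IPV6_UDP,
    ARFS_NONE /* traffic type not handled by aRFS */
};

enum arfs_table_id arfs_pick_table(enum l3_proto l3, enum l4_proto l4)
{
    if (l4 == L4_TCP)
        return l3 == L3_IPV4 ? ARFS_IPV4_TCP : ARFS_IPV6_TCP;
    if (l4 == L4_UDP)
        return l3 == L3_IPV4 ? ARFS_IPV4_UDP : ARFS_IPV6_UDP;
    return ARFS_NONE;
}

/* Group 0 holds full 4-tuple matches, group 1 the wildcard miss rule.
 * matched_rq is the RQ of a matching 4-tuple rule, or -1 if none matched;
 * rss_rq is the queue that default RSS hashing would pick. */
int arfs_table_forward(int matched_rq, int rss_rq)
{
    return matched_rq >= 0 ? matched_rq : rss_rq;
}
```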
    • M
      net/mlx5: Initializing CPU reverse mapping · 5a7b27eb
      Maor Gottlieb committed
      Allocate a CPU rmap and add an entry for each IRQ.
      The CPU rmap is used in aRFS to get the RX queue number
      of the RX completion interrupts.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net/mlx5e: Split the main flow steering table · 33cfaaa8
      Maor Gottlieb committed
      Currently, the main flow table is used for two purposes:
      one is to do mac filtering and the other is to classify
      the packet l3-l4 header in order to steer the packet to
      the right RSS TIR.
      
      This design is very complex: for each configured mac address we
      have to add eleven rules (one rule per traffic type), and the same applies if the
      device is put into promiscuous/allmulti mode.
      This scheme isn't scalable for future features like aRFS.
      
      In order to simplify it, the main flow table is split into two flow
      tables:
      1. l2 table - filters on the packet dmac address; if there is a match
      we forward to the ttc flow table.
      
      2. TTC (Traffic Type Classifier) table - classifies the traffic
      type of the packet and steers the packet to the right TIR.
      
      In this new design, when a new mac address is added, the driver adds
      only one flow rule instead of eleven.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
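The scaling argument in this commit comes down to simple arithmetic, sketched below. The figure of eleven traffic types comes from the commit message; the assumption that the split design keeps one ttc rule per traffic type (shared by all MAC addresses) is mine, as are the function names.

```c
#define NUM_TRAFFIC_TYPES 11u /* "eleven rules" per the commit message */

/* Old design: every configured MAC address needs one rule per traffic type. */
unsigned int rules_single_table(unsigned int n_macs)
{
    return n_macs * NUM_TRAFFIC_TYPES;
}

/* Split design: one l2 rule per MAC address, plus (assumed) one ttc rule
 * per traffic type that all addresses share. */
unsigned int rules_split_tables(unsigned int n_macs)
{
    return n_macs + NUM_TRAFFIC_TYPES;
}
```

Under these assumptions, 8 configured addresses need 88 rules in the single-table layout and 19 in the split layout, and each additional address costs 1 rule instead of 11.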
    • M
      net/mlx5e: Refactor mlx5e flow steering structs · acff797c
      Maor Gottlieb committed
      Slightly refactor and re-order the flow steering structs,
      tables and databases for better self-containment and
      flexibility to add more steering phases in the future
      (tables/rules/databases), e.g. aRFS.
      
      Changes:
      1. Move the vlan DB and address DB into their table structs.
      2. Rename steering table structs to a uniform format: mlx5e_*_table,
      e.g. mlx5e_vlan_table.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net/mlx5: Support different attributes for priorities in namespace · 13de6c10
      Maor Gottlieb committed
      Currently, a namespace can only be initialized
      with priorities that share the same attributes.
      Add support for initializing a namespace with priorities
      that have different attributes (e.g. a different number of levels).
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net/mlx5: Add user chosen levels when allocating flow tables · d63cd286
      Maor Gottlieb committed
      Currently, consumers of the flow steering infrastructure can't
      choose their own flow table levels and are limited to one
      flow table per level. This just wastes levels.
      Instead, we introduce here the possibility to use multiple
      flow tables in a level. The user is free to connect these
      flow tables, while following the rule that FTEs in an FT of level x
      can only point to FTs of level y where y > x.
      
      In addition, this patch switches the order in which the NIC flow
      tables (vlan and main) are created and destroyed.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
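The level rule (an entry in a table of level x may only point to a table of level y with y > x) and the reverse creation order can be sketched together. The `flow_table` struct and both function names are hypothetical; the real flow steering API is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flow table: a level plus the table its entries forward to. */
struct flow_table {
    unsigned int level;
    const struct flow_table *next;
};

/* Entries in src may only point to a table on a strictly higher level. */
bool ft_can_point_to(const struct flow_table *src, const struct flow_table *dst)
{
    return dst->level > src->level;
}

/* Create the chain in reverse: table[i + 1] exists before table[i], so
 * table[i] can take it as its destination at creation time.
 * Returns false if a link would violate the level rule. */
bool ft_chain_create(struct flow_table *tbl, const unsigned int *levels, size_t n)
{
    for (size_t i = n; i-- > 0; ) {
        tbl[i].level = levels[i];
        tbl[i].next = NULL;
        if (i + 1 < n) {
            if (!ft_can_point_to(&tbl[i], &tbl[i + 1]))
                return false;
            tbl[i].next = &tbl[i + 1];
        }
    }
    return true;
}
```

Two tables may share a level, but then neither can forward to the other, which is why a chain with levels {1, 1} is rejected while {0, 1, 2} is accepted.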
    • M
      net/mlx5: Set number of allowed levels in priority · a257b94a
      Maor Gottlieb committed
      Refactor the flow steering namespace creation
      by renaming num_fts to num_levels.
      When a new flow table is created, the driver assigns a new level
      to this flow table, so the two meanings are currently equivalent.
      Since downstream patches will introduce the ability to create more
      than one flow table per level, the name num_fts is no
      longer accurate.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net/mlx5: Introduce modify flow rule destination · d745098c
      Maor Gottlieb committed
      This API is used for modifying the flow rule destination.
      It is needed for changing the flow table that the
      traffic type classifier rules point to, so that they point to the aRFS tables.
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • T
      net/mlx5e: Direct TIR per RQ · 1da36696
      Tariq Toukan committed
      Introduce new TIRs for direct access per RQ.
      Now we have 2 available kinds of TIRs:
          - Indirect TIR per traffic type, each pointing to one RQT (RSS RQT),
            same as before.
          - New direct TIR per RQ, each pointing to an RQT of size one
            that forwards packets to that RQ only.
      
      The driver opens max channels (num cores) direct TIRs by default;
      they are filled with the actual RQs once channels are allocated.
      
      Needed for downstream aRFS and ethtool direct steering functionality.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
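The difference between the two kinds of TIRs can be modelled through the RQ table (RQT) each one points to. This is only an illustrative sketch: `struct rqt`, `MAX_CHANNELS`, the function names, and the modulo-based selection are assumptions, not the device's actual indirection scheme.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHANNELS 8 /* assumed upper bound on channels */

/* Hypothetical RQ table: the list of RQs a TIR can deliver to. */
struct rqt {
    int rq[MAX_CHANNELS];
    size_t size;
};

/* Indirect (RSS) case: the RQT lists every active RQ. */
void rqt_init_indirect(struct rqt *t, size_t num_channels)
{
    assert(num_channels > 0 && num_channels <= MAX_CHANNELS);
    t->size = num_channels;
    for (size_t i = 0; i < num_channels; i++)
        t->rq[i] = (int)i;
}

/* Direct case: an RQT of size one, so every packet reaches the same RQ. */
void rqt_init_direct(struct rqt *t, int rq)
{
    t->size = 1;
    t->rq[0] = rq;
}

/* Pick the destination RQ for a packet hash. */
int rqt_select(const struct rqt *t, unsigned int hash)
{
    return t->rq[hash % t->size];
}
```

With a direct RQT the hash is irrelevant, which is the property aRFS relies on to pin a flow to one core.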
    • M
      net/mlx5e: Call vxlan_get_rx_port() with rtnl lock · 01a14098
      Matthew Finlay committed
      Hold the rtnl lock when calling vxlan_get_rx_port().
      
      Fixes: b7aade15 ("vxlan: break dependency with netdev drivers")
      Signed-off-by: Matthew Finlay <matt@mellanox.com>
      Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'enc28j60-small-improvements' · 4b2523c1
      David S. Miller committed
      Michael Heimpold says:
      
      ====================
      net: ethernet: enc28j60: small improvements
      
      This series of two patches adds the following improvements to the driver:
      
      1) Rework the central SPI read function so that it is compatible with
         SPI masters which only support half duplex transfers.
      
      2) Add a device tree binding for the driver.
      
      Changelog:
      
      v3: * renamed and improved binding documentation as
            suggested by Rob Herring
      
      v2: * took care of Arnd Bergmann's review comments
            - allow to specify MAC address via DT
            - unconditionally define DT id table
          * increased the driver version minor number
          * driver author's email address bounces, removed from address list
      
      v1: * Initial submission
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net: ethernet: enc28j60: add device tree support · 2dd355a0
      Michael Heimpold committed
      This patch adds the required match table for device tree support
      (and, while at it, fixes the indentation). It is also possible to specify the
      MAC address in the DT blob.
      
      Also add the corresponding binding documentation file.
      Signed-off-by: Michael Heimpold <mhei@heimpold.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • M
      net: ethernet: enc28j60: support half-duplex SPI controllers · 2957a28a
      Michael Heimpold committed
      The current spi_read_buf function fails on SPI host masters which
      are only half-duplex capable. Splitting the Tx and Rx parts solves
      this issue.
      
      Tested on Raspberry Pi (full duplex) and I2SE Duckbill (half duplex).
      Signed-off-by: Michael Heimpold <mhei@heimpold.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • N
      net: constify is_skb_forwardable's arguments · f4b05d27
      Nikolay Aleksandrov committed
      is_skb_forwardable is not supposed to change anything, so constify its
      arguments.
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'ppp-rtnetlink' · 92aff96a
      David S. Miller committed
      Guillaume Nault says:
      
      ====================
      ppp: add rtnetlink support
      
      PPP devices lack the ability to be customised at creation time. In
      particular they can't be created in a given netns or with a particular
      name. Moving or renaming the device after creation is possible, but
      creates undesirable transient effects on servers where PPP devices are
      constantly created and removed, as users connect and disconnect.
      Implementing rtnetlink support solves this problem.
      
      The rtnetlink handlers implemented in this series are minimal, and can
      only replace the PPPIOCNEWUNIT ioctl. The rest of PPP ioctls remains
      necessary for any other operation on channels and units.
      It is perfectly possible to mix PPP devices created by rtnl
      and by ioctl(PPPIOCNEWUNIT). Devices will behave in the same way.
      
      mutex_trylock() is used to resolve the locking issue wrt. locking
      dependency between rtnl_lock() and ppp_mutex (see ppp_nl_newlink() in
      patch #2).
      
      A user visible difference brought by this series is that old PPP
      interfaces (those created with ioctl(PPPIOCNEWUNIT)), can now be
      removed by "ip link del", just like new rtnl based PPP devices.
      
      Changes since v3:
        - Rebase on net-next.
        - Not an RFC anymore.
      
      Changes since v2:
        - Define ->rtnl_link_ops for ioctl based PPP devices, so they can
          handle rtnl messages just like rtnl based ones (suggested by
          Stephen Hemminger).
        - Move back to original lock ordering between ppp_mutex and rtnl_lock
          to simplify patch series. Handle lock inversion issue using
          mutex_trylock() (suggested by Stephen Hemminger).
        - Do file descriptor lookup directly in ppp_nl_newlink(), to simplify
          ppp_dev_configure().
      
      Changes since v1:
        - Rebase on net-next.
        - Invert locking order wrt. ppp_mutex and rtnl_lock and protect
          file->private_data with ppp_mutex.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
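The trylock approach mentioned in the merge message (take the second lock only if it is free, and back off otherwise, so the two locks can never deadlock on ordering) can be sketched in userspace C11. This is an analogy, not the kernel code: `try_lock` stands in for a mutex, the caller is assumed to already hold the outer lock (rtnl_lock in the series), and returning `-EBUSY` on contention is an assumption made for the example.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal trylock-only lock built on a C11 atomic flag. */
struct try_lock {
    atomic_flag held;
};

/* Returns true if the lock was free and is now held by the caller. */
bool try_lock_acquire(struct try_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

void try_lock_release(struct try_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* Called with the outer lock held. Instead of blocking on the inner lock
 * (which could deadlock against a path taking the locks in the opposite
 * order), give up immediately and let the caller retry later. */
int newlink_sketch(struct try_lock *inner, int *units_created)
{
    if (!try_lock_acquire(inner))
        return -EBUSY; /* assumed error code for the back-off case */

    (*units_created)++; /* stands in for the device setup work */
    try_lock_release(inner);
    return 0;
}
```

The trade-off is that a contended call fails fast instead of waiting, so the caller has to be prepared to retry.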
    • G
      ppp: add rtnetlink device creation support · 96d934c7
      Guillaume Nault committed
      Define PPP device handler for use with rtnetlink.
      The only PPP specific attribute is IFLA_PPP_DEV_FD. It is mandatory and
      contains the file descriptor of the associated /dev/ppp instance (the
      file descriptor which would have been used for ioctl(PPPIOCNEWUNIT) in
      the ioctl-based API). The PPP device is removed when this file
      descriptor is released (same behaviour as with ioctl based PPP
      devices).
      
      PPP devices created with the rtnetlink API behave like the ones created
      with ioctl(PPPIOCNEWUNIT). In particular existing ioctls work the same
      way, no matter how the PPP device was created.
      The rtnl callbacks are also assigned to ioctl based PPP devices. This
      way, rtnl messages have the same effect on any PPP devices.
      The immediate effect is that all PPP devices, even ioctl-based
      ones, can now be removed with "ip link del".
      
      A minor difference still exists between ioctl and rtnl based PPP
      interfaces: in the device name, the number following the "ppp" prefix
      corresponds to the PPP unit number for ioctl based devices, while it is
      just an unrelated incrementing index for rtnl ones.
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • G
      ppp: define reusable device creation functions · 7d9f0b48
      Guillaume Nault committed
      Move PPP device initialisation and registration out of
      ppp_create_interface().
      This prepares code for device registration with rtnetlink.
      
      While there, simplify the prototype of ppp_create_interface():
      
        * Since ppp_dev_configure() takes care of setting file->private_data,
          there's no need to return a ppp structure to ppp_unattached_ioctl()
          anymore.
      
        * The unit parameter is made read/write so that ppp_create_interface()
          can tell which unit number has been assigned.
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      net: ethernet: stmmac: update MDIO support for GMAC4 · ac1f74a7
      Alexandre TORGUE committed
      On the new GMAC4 IP, the MAC_MDIO_address register has been updated and its bitmaps
      changed. This patch takes those changes into account.
      Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      vxlan: fix initialization with custom link parameters · 65226ef8
      Jiri Benc committed
      Commit 0c867c9b ("vxlan: move Ethernet initialization to a separate
      function") changed initialization order and as an unintended result, when the
      user specifies additional link parameters (such as IFLA_ADDRESS) while
      creating vxlan interface, those are overwritten by vxlan_ether_setup later.
      
      It's necessary to call ether_setup from within the ->setup callback. That
      way, the correct parameters are set by rtnl_create_link later. This is done
      also for VXLAN-GPE, as we don't know the interface type yet at that point;
      it is changed to the correct interface type later.
      
      Fixes: 0c867c9b ("vxlan: move Ethernet initialization to a separate function")
      Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      Merge branch 'samples-bpf-user-experience' · 638af178
      David S. Miller committed
      Jesper Dangaard Brouer says:
      
      ====================
      samples/bpf: Improve user experience
      
      There is a steep learning curve in getting started with the eBPF
      examples in samples/bpf/.  There are several dependencies, and
      specific versions of these dependencies.  Invoking make in the correct
      manner is also slightly obscure.
      
      This patchset cleans up, documents and hopefully improves the first-time
      user experience with the eBPF samples directory by auto-detecting
      certain scenarios.
      
      V4:
       - Address Naveen's nitpicks
       - Handle/fail if extra args are passed in LLC or CLANG (David Laight)
      
      V3:
       - Add Alexei's ACKs
       - Remove README paragraph about LLVM experimental BPF target
         as it only existed between LLVM version 3.6 to 3.7.
      
      V2:
       - Adjusted recommended minimum versions to 3.7.1
       - Included clang build instructions
       - New patch adding CLANG variable and validation of command
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      samples/bpf: like LLC also verify and allow redefining CLANG command · bdefbbf2
      Jesper Dangaard Brouer committed
      Users are likely to manually compile both the LLVM 'llc' and 'clang'
      tools.  Thus, also allow redefining CLANG and verify that the command exists.
      
      Makefile implementation-wise, the target that verifies the command has
      been generalized.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      samples/bpf: allow make to be run from samples/bpf/ directory · b62a796c
      Jesper Dangaard Brouer committed
      It is not intuitive that 'make' must be run from the top level
      directory with argument "samples/bpf/" to compile these eBPF samples.
      
      Introduce a kbuild makefile trick that allows make to be run from the
      "samples/bpf/" directory itself.  It basically changes to the top level
      directory and calls "make samples/bpf/" with the "/" slash after the
      directory name.
      
      Also add a clean target that only cleans this directory, by taking
      advantage of the kbuild external module setting M=$PWD.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      samples/bpf: add a README file to get users started · 1c97566d
      Jesper Dangaard Brouer committed
      Getting started with using examples in samples/bpf/ is not
      straightforward.  There are several dependencies, and specific
      versions of these dependencies.
      
      Just compiling the example tools is also slightly obscure, e.g. one
      needs to call make like:
      
       make samples/bpf/
      
      Do notice the "/" slash after the directory name.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      samples/bpf: Makefile verify LLVM compiler avail and bpf target is supported · 7b01dd57
      Jesper Dangaard Brouer committed
      Make compiling samples/bpf more user friendly by detecting whether the LLVM
      compiler tool 'llc' is available, and whether the 'bpf' target
      is available in this version of LLVM.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>