1. 12 May, 2016 35 commits
    • D
      Merge branch 'qed-sriov' · 48899291
      David S. Miller committed
      Yuval Mintz says:
      
      ====================
      qed*: Add SR-IOV support
      
      This patch adds SR-IOV support to qed/qede drivers, adding a new PCI
      device ID for a VF that is shared between all the various PFs that
      support IOV.
      
      This is quite a massive series - the first 7 parts of the series add
      the infrastructure of supporting vfs in qed - mainly adding support in a
      HW-based vf<->pf channel, as well as diverging all existing configuration
      flows based on the pf/vf decision. I.e., while PF-originated requests
      head directly to HW/FW, the VF requests first have to traverse to the PF
      which will perform the configuration.
      
      The 8th patch is the one that adds the support for the VF device in qede.
      
      The remaining 6 patches each add some user-based API support related to
      VFs that can be used over the PF - forcing mac/vlan, changing speed, etc.
      
      Dave,
      
      Sorry in advance for the length of the series. Most of the bulk here is in
      the infrastructure patches that have to go together [or at least, it makes
      little sense to try splitting them up].
      
      Please consider applying this to `net-next'.
      
      Thanks,
      Yuval
      
      Changes from previous revision:
      ------------------------------
       - V2 - Replace aligned_u64 with regular u64; This was possible as the
              shared structures [between PF and VF] were already sufficiently
              padded as-is in the API, making this redundant.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      48899291
    • Y
      qed*: Tx-switching configuration · 831bfb0e
      Yuval Mintz committed
      The device should be configured to VEB by default once VFs are active.
      This changes the configuration of both PFs' and VFs' vports to enable
      tx-switching once SR-IOV is enabled.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      831bfb0e
    • Y
      qed*: support ndo_get_vf_config · 73390ac9
      Yuval Mintz committed
      Allows the user to view the VF configuration by observing the PF's
      device.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      73390ac9
    • Y
      qed*: IOV support spoof-checking · 6ddc7608
      Yuval Mintz committed
      Add support for `ndo_set_vf_spoofchk', allowing the PF to control
      its VFs' spoof-checking configuration.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ddc7608
    • Y
      qed*: IOV link control · 733def6a
      Yuval Mintz committed
      This adds support for two NDOs that allow the PF to tweak the VF's view of
      the link: `ndo_set_vf_link_state', which gives the VF a view independent of
      the PF's, and `ndo_set_vf_rate', which allows the PF to limit the VF speed.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      733def6a
    • Y
      qed*: Support forced MAC · eff16960
      Yuval Mintz committed
      Allows the PF to enforce the VF's MAC,
      i.e., by using `ip link ... vf <x> mac <value>'.

      While a MAC is forced, the PF prevents the VF from configuring any other
      MAC.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eff16960
    • Y
      qed*: Support PVID configuration · 08feecd7
      Yuval Mintz committed
      This adds support for PF control over the VF VLAN configuration.
      I.e., `ip link ... vf <x> vlan <vid>' should now be supported.

       1. <vid> != 0 => VF receives [unknowingly] only traffic tagged with
          <vid>, and all outgoing traffic sent by the VF is tagged with <vid>.
       2. <vid> == 0 => Remove the PVID configuration, reverting to previous.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      08feecd7
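The PVID rules in the commit above can be modelled in a few lines. This is only an illustrative sketch, not the qed implementation: the `Frame` and `VfPort` names are hypothetical, and stripping the tag on receive is an assumption about what "unknowingly" means in the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    payload: bytes
    vlan: Optional[int] = None  # None means the frame is untagged


class VfPort:
    """Toy model of a VF vport with an optional PF-forced PVID (hypothetical)."""

    def __init__(self) -> None:
        self.pvid = 0  # 0 means no PVID is forced

    def set_pvid(self, vid: int) -> None:
        # Mirrors `ip link ... vf <x> vlan <vid>`; vid == 0 removes the PVID.
        if not 0 <= vid <= 4094:
            raise ValueError("VLAN id out of range")
        self.pvid = vid

    def receive(self, frame: Frame) -> Optional[Frame]:
        if self.pvid == 0:
            return frame  # no PVID: frame is delivered unchanged
        if frame.vlan != self.pvid:
            return None  # traffic not tagged with the PVID is dropped
        # Assumption: the tag is removed so the VF never sees it.
        return Frame(frame.payload, None)

    def transmit(self, frame: Frame) -> Frame:
        if self.pvid == 0:
            return frame
        # All outgoing traffic is tagged with the PVID.
        return Frame(frame.payload, self.pvid)
```

With a PVID of 100 set, `receive` only passes frames tagged 100 and `transmit` tags everything with 100; setting the PVID back to 0 restores pass-through behaviour.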
    • Y
      qede: Add VF support · fefb0202
      Yuval Mintz committed
      Adding a PCI callback for `sriov_configure' and a new PCI device id for
      the VF [+ some minor changes to accommodate differences between PF and VF
      in qede].
      Following this, VF creation should be possible and the entire subset of
      existing PF functionality that is allowed for VFs should be supported.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fefb0202
    • Y
      qed: Align TLVs · 17b235c1
      Yuval Mintz committed
      As the VF infrastructure is supposed to offer backward/forward
      compatibility, the various types associated with VF<->PF communication
      should be aligned across all various platforms that support IOV
      on our family of adapters.
      
      This adds a couple of currently missing values, specifically aligning
      the enum for the various TLVs possible in the communication between them.
      
      It then adds the PF implementation for some of those missing VF requests.
      This support isn't really necessary for the Linux VF, which does not
      require it [at least today], but it is required by VFs running on other
      OSes. LRO is an example of one such configuration.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      17b235c1
    • Y
      qed: Bulletin and Link · 36558c3d
      Yuval Mintz committed
      Up to this point, VF and PF communication always originates from VF.
      As a result, VF cannot be notified of any async changes, and specifically
      cannot be informed of the current link state.
      
      This introduces the bulletin board, the mechanism through which the PF
      is going to communicate async notifications back to the VF. Basically,
      it's a well-defined structure agreed upon by both PF and VF, which the VF
      continuously polls and into which the PF DMAs messages when needed.
      [The bulletin board is actually allocated and communicated in previous
      patches, but was never used before.]
      
      Based on the bulletin infrastructure, the VF can query its link status
      and receive said async carrier changes.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      36558c3d
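The bulletin-board idea from the commit above (PF writes into a shared structure, VF polls it) can be sketched as follows. This is a toy model under stated assumptions: the class names, the `version` counter used to detect changes, and the link fields are all hypothetical and are not taken from the qed sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bulletin:
    version: int = 0        # assumed change counter, bumped on every post
    link_up: bool = False
    speed_mbps: int = 0


class BulletinBoard:
    """Stands in for the shared memory that the PF would DMA into."""

    def __init__(self) -> None:
        self.content = Bulletin()

    def pf_post(self, link_up: bool, speed_mbps: int) -> None:
        # The PF publishes a whole new snapshot with a higher version.
        self.content = Bulletin(self.content.version + 1, link_up, speed_mbps)


class VfPoller:
    """VF side: polls the board and reports only unseen snapshots."""

    def __init__(self, board: BulletinBoard) -> None:
        self.board = board
        self.seen_version = 0

    def poll(self) -> Optional[Bulletin]:
        current = self.board.content
        if current.version == self.seen_version:
            return None  # nothing new since the last poll
        self.seen_version = current.version
        return current
```

The point of the version counter in this sketch is that the VF can poll cheaply and only act when the PF has actually posted something new, such as a link state change.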
    • Y
      qed: IOV l2 functionality · dacd88d6
      Yuval Mintz committed
      This adds sufficient changes to allow VFs l2-configuration flows to work.
      
      While the fastpath of the VF and the PF are meant to be exactly the same,
      the configuration of the VF is done by the PF.
      This diverges all VF-related configuration flows that originate from a VF,
      making them pass through the VF->PF channel and adding sufficient logic
      on the PF side to support them.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dacd88d6
    • Y
      qed: IOV configure and FLR · 0b55e27d
      Yuval Mintz committed
      While previous patches have already added the necessary logic to probe
      VFs as well as enabling them in the HW, this patch adds the ability to
      support VF FLR & SRIOV disable.
      
      It then wraps both flows together into the first IOV callback to be
      provided to the protocol driver - `configure'. This will later be used
      to enable and disable SRIOV in the adapter.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0b55e27d
    • Y
      qed: Introduce VFs · 1408cc1f
      Yuval Mintz committed
      This adds the qed VFs for the first time.
      The VFs are limited functions, with a very different PCI BAR structure
      [when compared with PFs] to better impose the related security demands
      associated with them.
      
      This patch includes the logic necessary to allow VFs to successfully probe
      [without actually adding the ability to enable IOV].
      This includes diverging all the flows that would occur as part of the PCI
      probe of the driver, preventing the VF from accessing registers/memories it
      can't and instead utilizing the VF->PF channel to query the PF for needed
      information.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1408cc1f
    • Y
      qed: Add VF->PF channel infrastructure · 37bff2b9
      Yuval Mintz committed
      Communication between VF and PF is based on a dedicated HW channel;
      the VF prepares a message, and by signaling the HW the PF gets a
      notification of that message's existence. The PF then copies the
      message, processes it and DMAs an answer back to the VF as a response.
      
      The messages themselves are TLV-based - allowing easier backward/forward
      compatibility.
      
      This patch adds the infrastructure of the channel on the PF side -
      starting with the arrival of the notification and ending with DMAing
      the response back to the VF.
      
      It also adds a dummy-response as reference, as it only lays the
      groundwork of the communication; it doesn't really add support of any
      actual messages.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      37bff2b9
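To illustrate why a TLV-based message format helps backward/forward compatibility, as the commit above argues, here is a minimal sketch. The wire layout (16-bit type, 16-bit length that includes the header, little-endian, a terminating list-end entry) and every name here are assumptions made for illustration, not the layout used by qed.

```python
import struct

HEADER = struct.Struct("<HH")  # assumed layout: type, length (header included)
TLV_LIST_END = 0xFFFF          # hypothetical terminator type


def encode_tlvs(items):
    """Serialize (type, value-bytes) pairs and append a list-end TLV."""
    out = bytearray()
    for tlv_type, value in items:
        out += HEADER.pack(tlv_type, HEADER.size + len(value)) + value
    out += HEADER.pack(TLV_LIST_END, HEADER.size)
    return bytes(out)


def decode_tlvs(buf):
    """Parse a buffer back into (type, value-bytes) pairs."""
    items = []
    offset = 0
    while True:
        if offset + HEADER.size > len(buf):
            raise ValueError("truncated TLV header")
        tlv_type, length = HEADER.unpack_from(buf, offset)
        if length < HEADER.size or offset + length > len(buf):
            raise ValueError("bad TLV length")
        if tlv_type == TLV_LIST_END:
            return items
        items.append((tlv_type, buf[offset + HEADER.size:offset + length]))
        offset += length


def pf_handle(request, handlers):
    """PF side: answer the TLVs it knows and skip the ones it does not."""
    replies = []
    for tlv_type, value in decode_tlvs(request):
        handler = handlers.get(tlv_type)
        if handler is not None:
            replies.append((tlv_type, handler(value)))
    return encode_tlvs(replies)
```

Because every entry carries its own length, an older receiver in this sketch can step over a TLV type it has never heard of instead of failing to parse the whole message.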
    • Y
      qed: Add CONFIG_QED_SRIOV · 32a47e72
      Yuval Mintz committed
      Add a new Kconfig option for the qed* drivers which will
      [eventually] allow support for VFs.
      
      This patch adds the necessary logic in the PF to learn about the possible
      VFs it will have to support [Based on PCI configuration space and HW],
      and prepare a database with an entry per-VF as infrastructure for future
      interaction with said VFs.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      32a47e72
    • D
      Merge tag 'nfc-next-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next · 631ad4a3
      David S. Miller committed
      Samuel Ortiz says:
      
      ====================
      NFC 4.7 pull request
      
      This is the first NFC pull request for 4.7. With this one we
      mainly have:
      
      - Support for NXP's pn532 NFC chipset. The pn532 is based on the same
        microcontroller as the pn533, but it talks to the host through i2c
        instead of USB. By separating the pn533 driver into core and PHY
        parts, we can now add the i2c layer and support the pn532 chipset.
      
      - Support for NCI's loopback mode. This is a testing mode where each
        packet received by the NFCC is sent back to the DH, allowing the
        host to test that the controller can receive and send data.
      
      - A few ACPI related fixes for the STMicro drivers, in order to match
        the device tree naming scheme.
      
      - A bunch of cleanups for the st-nci and the st21nfca STMicro drivers.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      631ad4a3
    • D
      Merge branch 'mlx5-next' · 6a47a570
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      Mellanox 100G mlx5 CQE compression
      
      Introducing the ConnectX-4 CQE (Completion Queue Entry) compression feature
      for the mlx5 Ethernet driver.

      CQE compression reduces PCI overhead by coalescing and compressing multiple CQEs into a
      single merged CQE.  Successful compression improves message rate, especially for small-packet
      traffic.

      CQE compression in detail:

      Instead of writing full CQEs to memory, multiple almost identical CQEs are merged and compressed.
      Information that is shared between the CQEs is written once, regardless of the number of
      compressed CQEs.  In addition, only the unique information (a small number of bytes compared to
      the full CQE size) is written per CQE.
      
      CQE Compression Block:
      
      This block contains multiple compressed CQEs.  CQE Compression Block contains a single copy
      of CQEs properties which are shared between all the compressed CQEs (called Title, see below)
      and multiple mini CQEs (CQEs in compressed form).
      
      Title:
      
      The Title holds information which is shared between all the compressed CQEs in the CQE Compression
      Block.  In each Compression Block there is only a single Title regardless of the number
      of compressed CQEs.
      
      Mini CQE:
      
      A CQE in compressed form that holds the data needed to reconstruct a single full CQE, for
      example 8 bytes instead of 64 bytes.
      The information shared between all compressed CQEs that belong to the same CQE Compression
      Block, called the Title, is written once, and only the unique information in each compressed
      CQE, for example 8 bytes, is written per compressed CQE, called a mini CQE.

      Since CQE compression can add overhead to the software (CPU),
      it will only be enabled on "weak/slow" PCI slots, where it can actually help.
      
      Applied on top: c047c3b1 ('netfilter: conntrack: remove uninitialized shadow variable')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6a47a570
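The Title / mini CQE split described in the cover letter above can be illustrated with a small model. This is a sketch only: the field names are hypothetical, the choice of which fields are shared is an assumption, and the byte accounting simply reuses the 64-byte and 8-byte example figures from the text while assuming the Title costs one full CQE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cqe:
    queue: int       # assumed to be shared by all CQEs in a block
    flags: int       # assumed to be shared by all CQEs in a block
    byte_count: int  # unique per completion
    checksum: int    # unique per completion


def compress(cqes):
    """Split similar CQEs into one title plus a list of mini CQEs."""
    title = cqes[0]
    minis = []
    for cqe in cqes:
        if (cqe.queue, cqe.flags) != (title.queue, title.flags):
            raise ValueError("CQE does not share the title's common fields")
        minis.append((cqe.byte_count, cqe.checksum))
    return title, minis


def decompress(title, minis):
    """Rebuild the full CQEs from the title and the mini CQEs."""
    return [Cqe(title.queue, title.flags, bc, cs) for bc, cs in minis]


def compressed_bytes(n, full_size=64, mini_size=8):
    """Bytes written for n CQEs: one full-size title plus n mini CQEs."""
    return full_size + n * mini_size
```

Under these assumptions, 8 completions cost 8 * 64 = 512 bytes uncompressed versus 64 + 8 * 8 = 128 bytes compressed, which is where the PCI saving would come from; the extra `decompress` step is the software overhead the cover letter mentions.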
    • S
      net/mlx5e: Enable CQE compression when PCI is slower than link · b797a684
      Saeed Mahameed committed
      We turn the feature on only for servers with PCI BW < MAX LINK BW, as it
      helps reduce PCI pressure on weak PCI slots but adds some software
      overhead.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b797a684
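The "PCI BW < MAX LINK BW" rule in the commit above can be expressed as a simple predicate. The sketch below is an assumption-laden illustration, not the driver's logic: the function names are hypothetical, and the per-lane figures are the usual post-encoding PCIe rates (8b/10b for Gen1/Gen2, 128b/130b for Gen3), rounded to whole Mb/s.

```python
# Approximate usable throughput per PCIe lane in Mb/s, after line encoding.
LANE_MBPS = {1: 2000, 2: 4000, 3: 7877}


def pci_bandwidth_mbps(gen: int, width: int) -> int:
    """Rough slot bandwidth for a given PCIe generation and lane count."""
    return LANE_MBPS[gen] * width


def should_enable_cqe_compression(gen: int, width: int, link_mbps: int) -> bool:
    """Enable compression only when the slot is slower than the link."""
    return pci_bandwidth_mbps(gen, width) < link_mbps
```

For a 100G link, a Gen3 x8 slot (about 63 Gb/s) would qualify in this model, while a Gen3 x16 slot (about 126 Gb/s) would not.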
    • T
      net/mlx5e: Expand WQE stride when CQE compression is enabled · d9d9f156
      Tariq Toukan committed
      Make the MPWQE/Striding RQ default configuration dynamic and not
      statically set at compile time.  Now at driver load we set
      the stride size and the number of strides dynamically.

      By default we use the same values as before, but when CQE compression
      is enabled, we set a larger stride size to benefit from CQE
      compression for larger packets.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d9d9f156
    • T
      net/mlx5e: CQE compression · 7219ab34
      Tariq Toukan committed
      The CQE compression feature is meant to save PCIe bandwidth by
      compressing several CQEs into a smaller number of bytes on PCIe.
      CQE compression can be selectively enabled per CQ.  It is disabled
      by default for now and will be enabled later on.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7219ab34
    • D
      Merge branch 'more-dsa-probing' · c1869d58
      David S. Miller committed
      Andrew Lunn says:
      
      ====================
      More enabler patches for DSA probing
      
      The complete set of patches for the reworked DSA probing is too big to
      post at once. This subset contains some enablers which are easy to
      review.

      Eventually, the Marvell driver will instantiate its own internal MDIO
      bus, rather than have the framework do it, thus allowing devices on the
      bus to be listed in the device tree. Initialize the main mutex as soon
      as it is created, to avoid lifetime issues with the mdio bus.
      
      A previous patch renamed all the DSA probe functions to make room for
      a true device probe. However the recent merging of all the Marvell
      switch drivers resulted in mv88e6xxx going back to the old probe
      name. Rename it again, so we can have a driver probe function.
      
      Add minimum support for the Marvell switch driver to probe as an MDIO
      device, as well as a DSA driver. Later patches will then register
      this device with the new DSA core framework.
      
      Move the GPIO reset code out of the DSA code. Different drivers may
      need different reset mechanisms, e.g. via a reset controller for
      memory mapped devices. Don't clutter up the core with this. Let each
      driver implement what it needs.
      
      master_dev is no longer needed in the switch drivers, since they have
      access to a device pointer from the probe function. Remove it.
      
      Let the switch parse the eeprom length from its own device tree
      node. This is required with the new binding when the central DSA
      platform device no longer exists.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1869d58
    • A
      dsa: mv88e6xxx: Handle eeprom-length property · f8cd8753
      Andrew Lunn committed
      A switch can export an attached EEPROM using the standard ethtool API.
      However the switch itself cannot determine the size of the EEPROM, and
      multiple sizes are allowed. Thus a device tree property is supported
      to indicate the length of the EEPROM. Parse this property during
      device probe, and implement a callback function to retrieve it.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f8cd8753
    • A
      dsa: Rename switch chip data to cd · ff04955c
      Andrew Lunn committed
      The dsa_switch structure contains a dsa_chip_data member called pd.
      However in the rest of the code, pd is used for dsa_platform_data.
      This is confusing. Rename it cd, which is already often used in dsa.c
      and slave.c for this data type.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ff04955c
    • A
      dsa: Remove master_dev from switch structure · c33063d6
      Andrew Lunn committed
      The switch drivers only use the master_dev member for dev_info()
      messages.  Now that the device is passed to the old style probe, and
      new style drivers are probed as true Linux drivers, this is no longer
      needed.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c33063d6
    • A
      dsa: Move gpio reset into switch driver · 52638f71
      Andrew Lunn committed
      Resetting the switch is something the driver does, not the framework.
      So move the parsing of this property into the driver.

      There are no in-kernel users of this property, so moving it does not
      break anything. There is however a board which will make use of this
      property making its way into the kernel.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52638f71
    • A
      dsa: Add mdio device support to Marvell switches · 14c7b3c3
      Andrew Lunn committed
      Allow Marvell switches to be mdio devices. Currently the driver just
      allocates the private structure and detects what device is on the
      bus. Later patches will make them register with the DSA framework.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      14c7b3c3
    • A
      dsa: mv88e6xxx: Rename probe function to fit the normal pattern · fcdce7d0
      Andrew Lunn committed
      All other DSA drivers use _drv_ in their DSA probe function name, thus
      allowing for a true Linux driver probe function to use the
      conventional name. Make mv88e6xxx fit this pattern.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fcdce7d0
    • A
      dsa: mv88e6xxx: Initialise the mutex as soon as it is created · b681957a
      Andrew Lunn committed
      By initialising it immediately, we don't run the danger of using it
      before it is initialised.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b681957a
    • V
      net: dsa: mv88e6xxx: add STU capability · cb9b9020
      Vivien Didelot committed
      Some switch models have an STU (per VLAN port state database). Add a new
      capability flag to the switches' info, instead of checking their family.

      Also if the 6165 family has an STU, it must have a VTU, so add the
      MV88E6XXX_FLAG_VTU to its family flags.
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb9b9020
    • V
      net: dsa: mv88e6xxx: abstract VTU/STU data access · 15d7d7d4
      Vivien Didelot committed
      Both VTU and STU operations use the same routine to access their
      (common) data registers, with a different offset.

      Add VTU and STU specific read and write functions to the data registers
      to abstract the required offset.
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      15d7d7d4
    • D
      Merge branch 'vrf-pktinfo' · c3f1010b
      David S. Miller committed
      David Ahern says:
      
      ====================
      net: vrf: Fixup PKTINFO to return enslaved device index
      
      Applications such as OSPF and BFD need the original ingress device not
      the VRF device; the latter can be derived from the former. To that end
      move the packet intercept from an rx handler that is invoked by
      __netif_receive_skb_core to the ipv4 and ipv6 receive processing.
      
      IPv6 already saves the skb_iif to the control buffer in ipv6_rcv. Since
      the skb->dev has not been switched the cb has the enslaved device. Make
      the same happen for IPv4 by adding the skb_iif to inet_skb_parm and set
      it in ipv4 code after clearing the skb control buffer similar to IPv6.
      From there the pktinfo can just pull it from cb with the PKTINFO_SKB_CB
      cast.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3f1010b
    • D
      net: original ingress device index in PKTINFO · 0b922b7a
      David Ahern committed
      Applications such as OSPF and BFD need the original ingress device not
      the VRF device; the latter can be derived from the former. To that end
      add the skb_iif to inet_skb_parm and set it in ipv4 code after clearing
      the skb control buffer similar to IPv6. From there the pktinfo can just
      pull it from cb with the PKTINFO_SKB_CB cast.
      
      The previous patch moving the skb->dev change to L3 means nothing else
      is needed for IPv6; it just works.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0b922b7a
    • D
      net: l3mdev: Add hook in ip and ipv6 · 74b20582
      David Ahern committed
      Currently the VRF driver uses the rx_handler to switch the skb device
      to the VRF device. Switching the dev prior to the ip / ipv6 layer
      means the VRF driver has to duplicate IP/IPv6 processing which adds
      overhead and makes features such as retaining the ingress device index
      more complicated than necessary.
      
      This patch moves the hook to the L3 layer just after the first NF_HOOK
      for PRE_ROUTING. This location makes exposing the original ingress device
      trivial (next patch) and allows adding other NF_HOOKs to the VRF driver
      in the future.
      
      dev_queue_xmit_nit is exported so that the VRF driver can cycle the skb
      with the switched device through the packet taps to maintain current
      behavior (tcpdump can be used on either the vrf device or the enslaved
      devices).
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      74b20582
    • N
      ipv6: fix 4in6 tunnel receive path · ca4aa976
      Nicolas Dichtel committed
      The protocol for a 4in6 tunnel is IPPROTO_IPIP. This was wrongly changed by
      the last cleanup.

      CC: Tom Herbert <tom@herbertland.com>
      Fixes: 0d3c703a ("ipv6: Cleanup IPv6 tunnel receive path")
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca4aa976
    • L
      tcp: replace cnt & rtt with struct in pkts_acked() · 756ee172
      Lawrence Brakmo committed
      Replace 2 arguments (cnt and rtt) in the congestion control modules'
      pkts_acked() function with a struct. This will allow adding more
      information without having to modify existing congestion control
      modules (tcp_nv in particular needs bytes in flight when packet
      was sent).

      As proposed by Neal Cardwell in his comments to the tcp_nv patch.
      Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      756ee172
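The design argument in the commit above (pass a struct instead of separate cnt and rtt arguments so fields can be added later) can be shown with a small model. The names below are hypothetical and written in Python for illustration; they are not the kernel's definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AckSample:
    pkts_acked: int
    rtt_us: int
    # A field added later gets a default, so modules that ignore it and
    # callers that do not set it keep working unchanged.
    in_flight: int = 0


class ToyCongestionControl:
    """A module written against the original two fields only."""

    def __init__(self) -> None:
        self.total_acked = 0
        self.min_rtt_us: Optional[int] = None

    def pkts_acked(self, sample: AckSample) -> None:
        self.total_acked += sample.pkts_acked
        # Negative RTT is treated here as "no valid measurement".
        if sample.rtt_us >= 0 and (
            self.min_rtt_us is None or sample.rtt_us < self.min_rtt_us
        ):
            self.min_rtt_us = sample.rtt_us
```

In this sketch, a new module that needs bytes in flight reads `sample.in_flight`, while `ToyCongestionControl` never has to change its `pkts_acked` signature.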
  2. 11 May, 2016 5 commits