1. 03 Jul, 2017 1 commit
  2. 08 Jun, 2017 1 commit
  3. 31 Jan, 2017 3 commits
  4. 30 Oct, 2016 2 commits
    • net/mlx4_en: Fix potential deadlock in port statistics flow · d2582a03
      Jack Morgenstein committed
      mlx4_en_DUMP_ETH_STATS took the *counter mutex* and then
      called the FW command with the WRAPPED attribute. As a result, the FW command
      is wrapped on the Hypervisor when it calls mlx4_en_DUMP_ETH_STATS.
      The FW command wrapper flow on the hypervisor takes the *slave_cmd_mutex*
      during processing.
      
      At the same time, a VF could be in the process of coming up, and could
      call mlx4_QUERY_FUNC_CAP.  On the hypervisor, the command flow takes the
      *slave_cmd_mutex*, then executes mlx4_QUERY_FUNC_CAP_wrapper.
      mlx4_QUERY_FUNC_CAP_wrapper calls mlx4_get_default_counter_index(),
      which takes the *counter mutex*. DEADLOCK.
      
      The fix is that the DUMP_ETH_STATS fw command should be called with
      the NATIVE attribute, so that on the hypervisor, this command does not
      enter the wrapper flow.
      
      Since the Hypervisor no longer goes through the wrapper code, we also
      simply return 0 in mlx4_DUMP_ETH_STATS_wrapper (i.e., the function succeeds,
      but the returned data will be all zeroes).
      No need to test if it is the Hypervisor going through the wrapper.
      
      Fixes: f9baff50 ("mlx4_core: Add "native" argument to mlx4_cmd ...")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
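The lock-order inversion described in commit d2582a03 can be illustrated with a small model. This is only a sketch, not driver code: it records the order in which each code path takes its locks and reports a cycle if two paths disagree. The function name `find_lock_order_cycle` and the path lists are hypothetical; only the two mutex names come from the commit message.

```python
from collections import defaultdict


def find_lock_order_cycle(paths):
    """Return a list of lock names forming a cycle, or None.

    Each path lists the locks one code path acquires, in order,
    while still holding the earlier ones.
    """
    edges = defaultdict(set)
    for locks in paths:
        for held, wanted in zip(locks, locks[1:]):
            edges[held].add(wanted)  # "held" is taken before "wanted"

    def visit(node, stack):
        if node in stack:
            return stack[stack.index(node):] + [node]
        for nxt in sorted(edges[node]):
            cycle = visit(nxt, stack + [node])
            if cycle:
                return cycle
        return None

    for start in sorted(edges):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Before the fix: the statistics flow enters the wrapper while holding
# the counter mutex, and the VF bring-up flow takes the locks the other way.
before = [
    ["counter_mutex", "slave_cmd_mutex"],   # DUMP_ETH_STATS, WRAPPED
    ["slave_cmd_mutex", "counter_mutex"],   # QUERY_FUNC_CAP wrapper
]

# After the fix: the NATIVE command never enters the wrapper flow.
after = [
    ["counter_mutex"],                      # DUMP_ETH_STATS, NATIVE
    ["slave_cmd_mutex", "counter_mutex"],   # QUERY_FUNC_CAP wrapper
]
```

With the `before` paths the model reports the cycle counter_mutex, slave_cmd_mutex, counter_mutex; with the `after` paths no cycle remains.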
    • net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec · aa0c08fe
      Jack Morgenstein committed
      The resource type enum in the resource tracker was incorrect.
      RES_EQ was put in the position of RES_NPORT_ID (a FC resource).
      
      Since the remaining resources maintain their current values,
      and RES_EQ is not passed from slaves to the hypervisor in any
      FW command, this change affects only the hypervisor.
      Therefore, there is no backwards-compatibility issue.
      
      Fixes: 623ed84b ("mlx4_core: initial header-file changes for SRIOV support")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 24 Sep, 2016 2 commits
    • net/mlx4: Add VF vlan protocol 802.1ad support · b42959dc
      Moshe Shemesh committed
      Move the vf to VST 802.1ad mode (mlx4 VST QinQ mode) by setting vf vlan
      protocol to 802.1ad.
      VST 802.1ad mode in mlx4 is used for STAG strip/insertion by PF, while
      the CTAG is set by the VF.
      Read current vlan protocol as part of the vf configuration state.
      
      Upon setting vf vlan protocol to 802.1ad, we use a handshake mechanism
      to verify that both the vf and the pf driver versions support it.
      The handshake uses the command QUERY_FUNC_CAP:
      - The vf sets a pre-defined support bit in input modifier.
      - A pf that supports the feature sends the request to the vf through a
        pre-defined field in the output mailbox.
      - In case vf does not support the feature, the pf will fail the control
        command (in this case, IP link tool command to set the vf vlan
        protocol to 802.1ad).
      
      No change in VST 802.1Q mode.
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
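The QUERY_FUNC_CAP handshake from commit b42959dc can be modeled roughly as follows. This is a sketch under stated assumptions, not the driver's code: the bit value `VF_SUPPORTS_QINQ`, the `PF` class, its method names, and the `qinq_requested` mailbox field are all hypothetical stand-ins for the pre-defined bit and field the commit message mentions.

```python
VF_SUPPORTS_QINQ = 1 << 0  # hypothetical bit position in the input modifier


class PF:
    """Toy model of the PF side of the 802.1ad handshake."""

    def __init__(self):
        self.vf_supports_qinq = {}  # vf id -> bool, learned from QUERY_FUNC_CAP
        self.vf_vlan_proto = {}     # vf id -> "802.1Q" or "802.1ad"

    def query_func_cap(self, vf, in_modifier):
        # The VF advertises support through a bit in the input modifier.
        self.vf_supports_qinq[vf] = bool(in_modifier & VF_SUPPORTS_QINQ)
        proto = self.vf_vlan_proto.get(vf, "802.1Q")  # default stays 802.1Q
        # The PF answers through a field in the output mailbox.
        return {
            "qinq_requested": proto == "802.1ad" and self.vf_supports_qinq[vf]
        }

    def set_vf_vlan_proto(self, vf, proto):
        # Control path (e.g. the ip link command): fail if the VF never
        # advertised support for 802.1ad.
        if proto == "802.1ad" and not self.vf_supports_qinq.get(vf, False):
            return False
        self.vf_vlan_proto[vf] = proto
        return True


pf = PF()
pf.query_func_cap(0, VF_SUPPORTS_QINQ)  # new VF driver
pf.query_func_cap(1, 0)                 # old VF driver, bit not set
```

In this model, setting 802.1ad succeeds for VF 0 and is rejected for VF 1, while 802.1Q remains available to both.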
    • net/mlx4_core: Preparation for VF vlan protocol 802.1ad · 7c3d21c8
      Moshe Shemesh committed
      Check device capability to support VF vlan protocol 802.1ad mode.
      Add vport attribute vlan protocol.
      Init vport vlan protocol by default to 802.1Q.
      Add update QP support for VF vlan protocol 802.1ad.
      Add func capability vlan_offload_disable to disable all
      vlan HW acceleration on VF while the VF is set to VF vlan protocol
      802.1ad mode.
      No change in VF vlan protocol 802.1Q (VST) mode.
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 22 Sep, 2016 1 commit
    • net/mlx4_core: Fix deadlock when switching between polling and event fw commands · a7e1f049
      Jack Morgenstein committed
      When switching from polling-based fw commands to event-based fw
      commands, there is a race condition which could cause a fw command
      in another task to hang: that task will keep waiting for the polling
      semaphore, but may never be able to acquire it. This is due to
      mlx4_cmd_use_events, which "down"s the semaphore back to 0.
      
      During driver initialization, this is not a problem, since no other
      tasks which invoke FW commands are active.
      
      However, there is a problem if the driver switches to polling mode
      and then back to event mode during normal operation.
      
      The "test_interrupts" feature does exactly that.
      Running "ethtool -t <eth device> offline" causes the PF driver to
      temporarily switch to polling mode, and then back to event mode.
      (Note that for VF drivers, such switching is not performed).
      
      Fix this by adding a read-write semaphore for protection when
      switching between modes.
      
      Fixes: 225c7b1f ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
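The protection added by commit a7e1f049 follows the usual reader/writer pattern. The sketch below is a non-blocking, single-threaded model of that pattern, not the kernel's rw_semaphore: it assumes that firmware commands take the read side and that a polling/event mode switch takes the write side. The class and method names are hypothetical.

```python
class ModeSwitchGuard:
    """Non-blocking model of a read-write semaphore.

    Many commands may run at once (readers); a mode switch (writer)
    needs exclusive access.
    """

    def __init__(self):
        self.readers = 0
        self.writer = False

    def try_begin_command(self):
        # A command may start unless a mode switch is in progress.
        if self.writer:
            return False
        self.readers += 1
        return True

    def end_command(self):
        self.readers -= 1

    def try_begin_switch(self):
        # A mode switch may start only when no command is in flight
        # and no other switch is running.
        if self.writer or self.readers:
            return False
        self.writer = True
        return True

    def end_switch(self):
        self.writer = False


guard = ModeSwitchGuard()
first = guard.try_begin_command()               # a command is in flight
switch_while_busy = guard.try_begin_switch()    # must wait
guard.end_command()
switch_when_idle = guard.try_begin_switch()     # now exclusive
command_during_switch = guard.try_begin_command()
guard.end_switch()
command_after_switch = guard.try_begin_command()
```

A real read-write semaphore blocks instead of returning False, but the exclusion rules are the same: no command can be waiting on the polling semaphore while the switch resets it.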
  7. 22 Apr, 2016 1 commit
  8. 02 Mar, 2016 1 commit
  9. 20 Jan, 2016 1 commit
  10. 07 Dec, 2015 2 commits
  11. 15 Oct, 2015 1 commit
    • net/mlx4_core: Replace VF zero mac with random mac in mlx4_core · 2b3ddf27
      Jack Morgenstein committed
      By design, when no default MAC addresses are set in the Hypervisor for VFs,
      the VFs are passed zero-macs. When such a MAC is received by the VF, it
      generates a random MAC address and registers that MAC address
      with the Hypervisor.
      
      This random mac generation is currently done in the mlx4_en module.
      There is a problem, though, if the mlx4_ib module is loaded by a VF before
      the mlx4_en module. In this case, for RoCE, mlx4_ib will see the un-replaced
      zero-mac and register that zero-mac as part of QP1 initialization.
      
      Having a zero-mac in the port's MAC table creates problems for a
      Baseboard Management Controller. The BMC occasionally sends packets with a
      zero-mac destination MAC. If there is a zero-mac present in the port's
      MAC table, the FW will send such BMC packets to the host driver rather than
      to the wire, and BMC will stop working.
      
      To address this problem, we move the replacement of zero-mac addresses
      with random-mac addresses to procedure mlx4_slave_cap(), which is part of the
      driver startup for VFs, and is before activation of mlx4_ib and mlx4_en.
      As a result, zero-mac addresses will never be registered in the port MAC table
      by the driver.
      
      In addition, when mlx4_en does initialize the net device, it needs to set
      the NET_ADDR_RANDOM flag in the netdev structure if the address was
      randomly generated. This is done so that udev on the VM does not create
      a new device name after each VF probe (VM boot and such). To accomplish this,
      we add a per-port flag in mlx4_dev which gets set whenever mlx4_core replaces
      a zero-mac with a randomly-generated mac. This flag is examined when mlx4_en
      initializes the net-device.
      
      Fix was suggested by Matan Barak <matanb@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
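The zero-MAC replacement described in commit 2b3ddf27 amounts to the following logic. This is a minimal sketch with a hypothetical function name; it assumes the random address is a locally administered unicast address, which is the common convention for generated MACs. The returned flag plays the role of the per-port flag that mlx4_en later checks before setting NET_ADDR_RANDOM.

```python
import os


def replace_zero_mac(mac):
    """Return (mac, was_random) for a 6-byte MAC address.

    A zero MAC is replaced by a random one; any other MAC is kept.
    """
    if len(mac) != 6:
        raise ValueError("a MAC address has exactly 6 bytes")
    if any(mac):
        return mac, False          # a real MAC was assigned, keep it
    rnd = bytearray(os.urandom(6))
    rnd[0] &= 0xFE                 # clear the multicast bit
    rnd[0] |= 0x02                 # set the locally administered bit
    return bytes(rnd), True


new_mac, was_random = replace_zero_mac(bytes(6))
assigned = bytes.fromhex("0002c9aabbcc")  # example value, not from the source
kept_mac, kept_flag = replace_zero_mac(assigned)
```

Because the locally administered bit is always set, the generated address can never be the zero MAC again.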
  12. 28 Jul, 2015 1 commit
  13. 16 Jun, 2015 3 commits
  14. 31 May, 2015 2 commits
    • net/mlx4_core: Move affinity hints to mlx4_core ownership · de161803
      Ido Shamay committed
      Now that EQ management is the sole responsibility of mlx4_core,
      the IRQ affinity hints configuration should be in its hands as well.
      request_irq is called only once by the first consumer (maybe mlx4_ib),
      so mlx4_en passes the affinity mask too late. We also need to request
      vectors according to the cores we want to run on.
      
      mlx4_core's distribution of IRQs to cores is straightforward:
      EQ(i)->IRQ will set the affinity hint to core i.
      Consumers need to request EQ vectors according to their own core
      considerations (NUMA).
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4: Add EQ pool · c66fa19c
      Matan Barak committed
      Previously, mlx4_en allocated EQs and used them exclusively.
      This affected RoCE performance, as event-sensitive applications
      were limited to using only the legacy EQs.
      
      Change that by introducing an EQ pool. This pool is managed
      by mlx4_core. EQs are assigned to ports (when there is a limited
      number of EQs, multiple ports could be assigned to the same EQs).
      
      An exception to this rule is the ASYNC EQ which handles various events.
      
      Legacy EQs are completely removed as all EQs could be shared.
      
      When a consumer (mlx4_ib/mlx4_en) requests an EQ, it asks for
      EQ serving on a specific port. The core driver calculates which
      EQ should be assigned to that request.
      
      Because IRQs are shared between IB and Ethernet modules, their
      names only include the PCI device BDF address.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 16 Apr, 2015 1 commit
  16. 03 Apr, 2015 4 commits
  17. 09 Mar, 2015 1 commit
  18. 05 Feb, 2015 1 commit
    • net/mlx4_core: Port aggregation upper layer interface · 53f33ae2
      Moni Shoua committed
      Supply interface functions to bond and unbond ports of an mlx4 internal
      interface. An example of such an interface is the one registered by the
      mlx4 IB driver under RoCE.

      There are:
      
      1. Functions to go in/out to/from bonded mode
      2. Function to remap virtual ports to physical ports
      
      The bond_mutex prevents simultaneous access to the data that keeps the
      status of the device in bonded mode.

      An upper mlx4 interface signals to the mlx4 core module that it
      wants to be subject to such bonding by setting the MLX4_INTFF_BONDING
      flag. An interface that enters or leaves bonded mode is re-created.

      The mlx4 Ethernet driver does not set this flag when registering its
      interface; the IB driver does.
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 03 Feb, 2015 1 commit
  20. 28 Jan, 2015 1 commit
  21. 26 Jan, 2015 6 commits
  22. 12 Dec, 2014 3 commits
    • net/mlx4: Add support for A0 steering · 7d077cd3
      Matan Barak committed
      Add the required firmware commands for A0 steering and a way to enable
      that. The firmware support focuses on INIT_HCA, QUERY_HCA, QUERY_PORT,
      QUERY_DEV_CAP and QUERY_FUNC_CAP commands. Those commands are used
      to configure and query the device.
      
      The different A0 DMFS (steering) modes are:
      
      Static - optimized performance, but flow steering rules are
      limited. This mode should be chosen explicitly by the user
      in order to be used.

      Dynamic - this mode should be explicitly chosen by the user.
      In this mode, the FW works in optimized steering mode as long as
      it can and afterwards automatically drops to classic (full) DMFS.

      Disable - this mode should be explicitly chosen by the user.
      The user instructs the system not to use optimized steering, even if
      the FW supports Dynamic A0 DMFS (and thus will be able to use optimized
      steering in Default A0 DMFS mode).

      Default - this mode is implicitly chosen. In this mode, if the FW
      supports Dynamic A0 DMFS, it'll work in this mode. Otherwise, it'll
      work in Disable A0 DMFS mode.
      
      Under SRIOV configuration, when the A0 steering mode is enabled,
      older guest VF drivers that aren't using the RX QP allocation flag
      (MLX4_RESERVE_A0_QP) will get a QP from the general range and
      fail when attempting to register a steering rule. To avoid that,
      the PF context behaviour is changed once on A0 static mode, to
      require support for the allocation flag in VF drivers too.
      
      In order to enable A0 steering, we use log_num_mgm_entry_size param.
      If the value of the parameter is not positive, we treat the absolute
      value of log_num_mgm_entry_size as a bit field. Setting bit 2 of this
      bit field enables static A0 steering.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
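The parameter encoding and the mode selection described in commit 7d077cd3 can be sketched as below. The function names are hypothetical, and "bit 2" is assumed to use zero-based bit numbering (mask `1 << 2`); the mode strings simply mirror the four names in the commit message.

```python
A0_STATIC_BIT = 1 << 2  # "bit 2", assuming zero-based bit numbering


def static_a0_requested(log_num_mgm_entry_size):
    """True if the module parameter asks for static A0 steering."""
    if log_num_mgm_entry_size > 0:
        return False  # positive values are not treated as a bit field
    # Non-positive values: the absolute value is a bit field.
    return bool(abs(log_num_mgm_entry_size) & A0_STATIC_BIT)


def effective_a0_mode(requested, fw_supports_dynamic):
    """Resolve the A0 DMFS mode that ends up in use."""
    if requested == "default":
        # Default follows the firmware capability.
        return "dynamic" if fw_supports_dynamic else "disable"
    if requested in ("static", "dynamic", "disable"):
        return requested  # explicitly chosen by the user
    raise ValueError("unknown A0 DMFS mode: %r" % (requested,))
```

For example, a parameter value of -4 has only bit 2 set in its absolute value, so it requests static A0 steering, while -1 and 4 do not.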
    • net/mlx4: Add A0 hybrid steering · d57febe1
      Matan Barak committed
      A0 hybrid steering is a form of high performance flow steering.
      By using this mode, mlx4 cards use a fast limited table based steering,
      in order to enable fast steering of unicast packets to a QP.
      
      In order to implement A0 hybrid steering we allocate resources
      from different zones:
      (1) General range
      (2) Special MAC-assigned QPs [RSS, Raw-Ethernet] each has its own region.
      
      When we create a rss QP or a raw ethernet (A0 steerable and BF ready) QP,
      we try hard to allocate the QP from range (2). Otherwise, we try hard not
      to allocate from this range. However, when the system is pushed to its
      limits and one needs every resource, the allocator uses every region it can.
      
      That is, when we run out of raw-eth QPs, the allocator allocates from the
      general range (and the special-A0 area is no longer active). If we run out
      of RSS QPs, the mechanism tries to allocate from the raw-eth QP zone. If that
      is also exhausted, the allocator will allocate from the general range
      (and the A0 region is no longer active).
      
      Note that if a raw-eth qp is allocated from the general range, it attempts
      to allocate the range such that bits 6 and 7 (blueflame bits) in the
      QP number are not set.
      
      When the feature is used in SRIOV, the VF has to notify the PF what
      kind of QP attributes it needs. In order to do that, along with the
      "Eth QP blueflame" bit, we reserve a new "A0 steerable QP" bit. According
      to the combination of these bits, the PF tries to allocate a suitable QP.
      
      In order to maintain backward compatibility (with older PFs), the PF
      notifies which QP attributes it supports via QUERY_FUNC_CAP command.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
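The fallback order that commit d57febe1 describes for RSS and raw-Ethernet QPs can be sketched as follows. This is a toy model, not the driver's allocator: the zone names, the `allocate_qp` function, and the free lists are hypothetical, and only the two request kinds whose fallback the commit message spells out are covered.

```python
# Order in which zones are tried for each kind of request.
FALLBACK_ORDER = {
    "rss": ["rss", "raw_eth", "general"],
    "raw_eth": ["raw_eth", "general"],
}

BLUEFLAME_BITS = (1 << 6) | (1 << 7)  # bits 6 and 7 of the QP number


def allocate_qp(kind, free):
    """Take a QP number for `kind` from `free` (zone -> list of free QPNs).

    Returns (zone, qpn), or None if every permitted zone is exhausted.
    """
    for zone in FALLBACK_ORDER[kind]:
        candidates = free[zone]
        if kind == "raw_eth" and zone == "general":
            # Prefer QP numbers without the blueflame bits set, but
            # fall back to any free number if none qualifies.
            preferred = [q for q in candidates if not q & BLUEFLAME_BITS]
            candidates = preferred or candidates
        if candidates:
            qpn = candidates[0]
            free[zone].remove(qpn)
            return zone, qpn
    return None


free = {"rss": [], "raw_eth": [0x10], "general": [0x40, 0x100]}
r1 = allocate_qp("rss", free)      # RSS zone empty, falls to raw_eth
r2 = allocate_qp("raw_eth", free)  # raw_eth now empty, general: skips 0x40
r3 = allocate_qp("raw_eth", free)  # only 0x40 left, taken anyway
r4 = allocate_qp("raw_eth", free)  # nothing left
```

In this run the RSS request lands in the raw-eth zone, and the raw-eth requests spill into the general range, first avoiding 0x40 because bit 6 is set.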
    • net/mlx4: Add mlx4_bitmap zone allocator · 7a89399f
      Matan Barak committed
      The zone allocator is a mechanism which manages a few mlx4_bitmaps.
      
      When allocating a resource, the user indicates the desired zone
      from which the resource should be allocated. If possible, the resource
      is allocated from this zone. Otherwise, it is allocated from a zone of
      lower, equal, or higher priority, tried in that order, as permitted by
      the desired zone's properties.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
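One possible reading of the zone allocator in commit 7a89399f is sketched below. It assumes each zone carries a priority plus three properties that say whether a request may fall back to zones of lower, equal, or higher priority, and that these are tried in that order. The `Zone` class, its field names, and `zone_alloc` are hypothetical and are not the mlx4 API.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    priority: int
    free: list                 # free resource numbers in this zone
    allow_lower: bool = False  # may fall back to lower-priority zones
    allow_equal: bool = False  # may fall back to equal-priority zones
    allow_higher: bool = False # may fall back to higher-priority zones


def zone_alloc(zones, desired_name):
    """Allocate from the desired zone, else from permitted fallback zones."""
    desired = next(z for z in zones if z.name == desired_name)
    others = [z for z in zones if z is not desired]
    order = [desired]
    if desired.allow_lower:
        order += [z for z in others if z.priority < desired.priority]
    if desired.allow_equal:
        order += [z for z in others if z.priority == desired.priority]
    if desired.allow_higher:
        order += [z for z in others if z.priority > desired.priority]
    for zone in order:
        if zone.free:
            return zone.name, zone.free.pop(0)
    return None


zones = [
    Zone("general", 0, [1, 2]),
    Zone("rss", 1, [], allow_lower=True),
    Zone("raw", 2, [9]),
]
a1 = zone_alloc(zones, "rss")  # rss is empty, falls back to general
a2 = zone_alloc(zones, "raw")  # served from its own zone
a3 = zone_alloc(zones, "raw")  # raw is empty and has no fallback
```

Keeping the fallback policy as properties of the desired zone lets each zone decide independently how far a request may spill over.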