1. 20 September 2014 (3 commits)
    • net/mlx4_en: Add mlx4_en_get_cqe helper · b1b6b4da
      Committed by Ido Shamay
      This function derives the base address of the CQE from the CQE size,
      and then uses the factor to locate the actual CQE context segment
      within it (as before). Prior to this change, the code used the factor
      to calculate the base address of the CQE as well.
      
      The factor indicates in which segment of the CQE stride the CQE
      information is located. For 32-byte strides the segment is 0, and for
      64-byte strides it is 1 (bytes 32..63). Using the factor was fine as
      long as we had only 32-byte and 64-byte strides. However, with larger
      strides the factor is zero, so it cannot be used to calculate the base
      of the CQE.
      
      The helper pulls CQEs from the buffer using the same method as the
      other components that read the CQE buffer (the mlx4_ib driver and
      libmlx4); a sketch of the addressing scheme follows below.
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
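
      As a rough illustration of the addressing scheme described above, here
      is a minimal sketch in plain C; the helper name and the 32-byte context
      size are assumptions, not the actual mlx4_en code:

      #include <stdint.h>
      #include <stddef.h>

      #define CQE_CONTEXT_SIZE 32  /* bytes occupied by the CQE context segment */

      /*
       * The stride base comes from the CQE size alone; the factor only selects
       * the 32-byte segment holding the context (0 for 32B/128B/256B strides,
       * 1 for the legacy 64B stride, i.e. bytes 32..63).
       */
      static inline void *get_cqe_ctx(uint8_t *cq_buf, uint32_t index,
                                      uint32_t cqe_size, uint32_t factor)
      {
              uint8_t *stride_base = cq_buf + (size_t)index * cqe_size;

              return stride_base + (size_t)factor * CQE_CONTEXT_SIZE;
      }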
    • net/mlx4_core: Cache line EQE size support · 43c816c6
      Committed by Ido Shamay
      Enable the mlx4 interrupt handler to work with the EQE stride feature.
      The feature may be enabled when the cache line is larger than 64B. The
      EQE size is then the cache line size, and the context segment resides
      at offset [0-31] (a sketch follows below).
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
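
      A hedged sketch of how such EQEs might be indexed once the stride
      equals the cache line size; the names are illustrative, not the actual
      mlx4 interrupt-handler code:

      #include <stdint.h>
      #include <stddef.h>

      /*
       * With the EQE stride feature each entry occupies eqe_size bytes (the
       * cache line size, e.g. 128), but the event context always lives in
       * bytes [0..31] of the entry, so no extra offset is applied.
       */
      static inline void *get_eqe(uint8_t *eq_buf, uint32_t cons_index,
                                  uint32_t eqe_size, uint32_t nent)
      {
              /* nent is assumed to be a power of two */
              return eq_buf + (size_t)(cons_index & (nent - 1)) * eqe_size;
      }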
    • net/mlx4_core: Enable CQE/EQE stride support · 77507aa2
      Committed by Ido Shamay
      This feature is intended for archs whose cache line is larger than 64B.
      
      Since our CQEs/EQEs are generally 64B on those systems, the HW will
      write twice to the same cache line consecutively, causing pipe locks
      due to the hazard prevention mechanism. For elements in a cyclic
      buffer, writes are consecutive, so entries smaller than a cache line
      should be avoided, especially if they are written at a high rate.
      
      Reduce consecutive writes to the same cache line in CQs/EQs by allowing
      the driver to increase the distance between entries so that each
      resides in a different cache line. Until the introduction of this
      feature, there were two types of CQE/EQE:
      
      1. 32B stride and context in the [0-31] segment
      2. 64B stride and context in the [32-63] segment
      
      This feature introduces two additional types:
      
      3. 128B stride and context in the [0-31] segment (128B cache line)
      4. 256B stride and context in the [0-31] segment (256B cache line)
      
      Modify the mlx4_core driver to query the device for the CQE/EQE cache
      line stride capability and to enable that capability when the host
      cache line size is larger than 64 bytes (supported cache lines are
      128B and 256B).
      
      The mlx4 IB driver and libmlx4 need not be aware of this change. The PF
      context behaviour is changed to require this change in VF drivers
      running on such archs.
      Signed-off-by: Ido Shamay <idos@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
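
      One way to picture the resulting stride selection is the sketch below;
      the function and constant choices are assumptions rather than the
      actual mlx4_core logic:

      /*
       * Keep the default 64B entry unless the device reports the stride
       * capability and the host cache line is one of the supported larger
       * sizes (128B or 256B); in that case the stride grows to a full cache
       * line while the context stays in bytes [0..31].
       */
      static unsigned int pick_entry_stride(unsigned int cache_line_bytes,
                                            int hw_supports_stride)
      {
              if (hw_supports_stride &&
                  (cache_line_bytes == 128 || cache_line_bytes == 256))
                      return cache_line_bytes;

              return 64;      /* legacy behaviour: 32B/64B entries */
      }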
  2. 06 September 2014 (1 commit)
  3. 04 September 2014 (1 commit)
  4. 30 August 2014 (2 commits)
  5. 13 August 2014 (1 commit)
  6. 05 August 2014 (1 commit)
    • mlx4_core: Add support for secure-host and SMP firewall · 114840c3
      Committed by Jack Morgenstein
      Secure-host is the general term for the capability of a device
      to protect itself and the subnet from malicious host software.
      
      This is achieved by:
      1. Not allowing untrusted entities to access device configuration
         registers, either directly (through pci_cr or pci_conf) or
         indirectly (through MADs).
      
      2. Hiding M_Key from untrusted entities.
      
      3. Preventing the modification of GUID0 by untrusted entities.
      
      4. Not allowing drivers on untrusted hosts to receive or transmit
         packets over QP0 (SMP firewall).
      
      The secure-host capability depends on firmware handling all QP0
      packets, and not passing these packets up to the driver. Any information
      required by the driver for proper operation (e.g., SM lid) is passed
      via events generated by the firmware while processing QP0 MADs.
      
      Driver support mainly requires using the MAD_DEMUX FW command at startup,
      where the feature is enabled/disabled through a procedure described in
      the Mellanox HCA tools package.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      
      [ Fix error path in mlx4_setup_hca to go to err_mcg_table_free. - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
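
      Purely as an illustration of the startup flow mentioned above (probe a
      capability, then ask firmware to keep handling QP0 MADs), here is a
      sketch in which every identifier is a made-up placeholder, not the real
      mlx4 command interface:

      #include <stdio.h>

      enum { HCA_CAP_SECURE_HOST = 1u << 0 };       /* hypothetical cap bit */

      struct hca_dev { unsigned int caps; };        /* hypothetical device */

      /* Stand-in for issuing a MAD_DEMUX-style firmware command. */
      static int hca_cmd_mad_demux(struct hca_dev *dev, int enable)
      {
              (void)dev;      /* unused in this stub */
              printf("MAD_DEMUX %s\n", enable ? "enable" : "disable");
              return 0;
      }

      static int enable_smp_firewall(struct hca_dev *dev)
      {
              if (!(dev->caps & HCA_CAP_SECURE_HOST))
                      return 0;       /* firmware does not expose secure-host */

              /* Keep QP0 MADs in firmware; the driver then learns values such
               * as the SM LID from firmware-generated events instead. */
              return hca_cmd_mad_demux(dev, 1);
      }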
  7. 02 August 2014 (1 commit)
  8. 25 July 2014 (1 commit)
  9. 23 July 2014 (4 commits)
  10. 17 July 2014 (6 commits)
  11. 15 July 2014 (1 commit)
    • mlx4: mark napi id for gro_skb · 32b333fe
      Committed by Jason Wang
      The napi id was not marked for gro_skb. This caused the RX busy loop to
      not work correctly, since the stack never tries to call the low-latency
      receive method because of a zero socket napi id. Fix this by marking
      the napi id for gro_skb (see the sketch below).
      
      The transaction rate of a 1-byte netperf TCP_RR test increases by about
      50% (from 20531.68 to 30610.88).
      
      Cc: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
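
      The marking itself presumably boils down to a call like the one in this
      simplified stand-in; the surrounding helper is an assumption, not the
      exact mlx4_en RX path:

      #include <linux/skbuff.h>
      #include <linux/netdevice.h>
      #include <net/busy_poll.h>

      /*
       * Tail of an RX completion handler: record which NAPI context the GRO
       * skb came from before handing it to the stack, so socket busy polling
       * can find the right queue (a zero napi_id disables it for that flow).
       */
      static void rx_deliver_gro(struct napi_struct *napi, struct sk_buff *gro_skb)
      {
              skb_mark_napi_id(gro_skb, napi);
              napi_gro_receive(napi, gro_skb);
      }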
  12. 09 July 2014 (7 commits)
  13. 08 July 2014 (1 commit)
  14. 03 July 2014 (2 commits)
  15. 23 June 2014 (1 commit)
  16. 12 June 2014 (1 commit)
    • net/mlx4_en: Use affinity hint · 9e311e77
      Committed by Yuval Atias
      The “affinity hint” mechanism is used by the user-space daemon
      irqbalance to indicate a preferred CPU mask for IRQs. irqbalance can
      use this hint to balance the IRQs among the CPUs indicated by the mask.
      
      We wish the HCA to preferentially map the IRQs it uses to NUMA cores
      close to it. To accomplish this, we use cpumask_set_cpu_local_first(),
      which sets the affinity hint according to the following policy: first,
      IRQs are mapped to “close” NUMA cores; once these are exhausted, the
      remaining IRQs are mapped to “far” NUMA cores (a sketch follows below).
      Signed-off-by: Yuval Atias <yuvala@mellanox.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
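
      A sketch of that policy, assuming the (index, node, mask) calling
      convention cpumask_set_cpu_local_first() had at the time; the ring
      bookkeeping is illustrative, not the actual mlx4_en code:

      #include <linux/cpumask.h>
      #include <linux/interrupt.h>
      #include <linux/gfp.h>

      /* Each ring keeps its hint mask alive, because irq_set_affinity_hint()
       * only stores the pointer it is given. */
      struct ring_irq {
              int irq;
              cpumask_var_t hint_mask;
      };

      /* Publish NUMA-local-first affinity hints for all ring interrupts. */
      static void set_ring_affinity_hints(struct ring_irq *rings, int nrings,
                                          int numa_node)
      {
              int i;

              for (i = 0; i < nrings; i++) {
                      if (!zalloc_cpumask_var(&rings[i].hint_mask, GFP_KERNEL))
                              continue;
                      /* pick the i-th CPU, preferring cores local to the HCA */
                      if (cpumask_set_cpu_local_first(i, numa_node,
                                                      rings[i].hint_mask) < 0)
                              continue;
                      irq_set_affinity_hint(rings[i].irq, rings[i].hint_mask);
              }
      }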
  17. 11 June 2014 (2 commits)
    • net/mlx4_core: Keep only one driver entry release mlx4_priv · da1de8df
      Committed by Wei Yang
      Following commit befdf897 "net/mlx4_core: Preserve pci_dev_data after
      __mlx4_remove_one()", there are two mlx4 pci callbacks which will
      attempt to release the mlx4_priv object -- .shutdown and .remove.
      
      This leads to a use-after-free access to the already freed mlx4_priv
      instance and triggers a "Kernel access of bad area" crash when both
      .shutdown and .remove are called.
      
      During reboot or kexec, .shutdown is called: the VFs probed on the host
      go through shutdown first, and then the PF. Later, the PF triggers the
      VFs' .remove, since the VFs still have a driver attached.
      
      Fix that by keeping only one driver entry which releases mlx4_priv.
      
      Fixes: befdf897 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()')
      CC: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
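
      As a generic illustration of the single-release idea above (not the
      actual mlx4 patch), the sketch below lets only .remove free the private
      data, so a shutdown-then-remove sequence cannot free it twice; all
      names are placeholders:

      #include <linux/pci.h>
      #include <linux/slab.h>

      struct my_priv { int dummy; };          /* placeholder private data */

      static void my_shutdown(struct pci_dev *pdev)
      {
              /* quiesce DMA/interrupts here, but do not free the drvdata */
      }

      static void my_remove(struct pci_dev *pdev)
      {
              struct my_priv *priv = pci_get_drvdata(pdev);

              if (!priv)
                      return;                 /* nothing left to release */

              kfree(priv);
              pci_set_drvdata(pdev, NULL);    /* make repeat calls harmless */
      }

      static struct pci_driver my_driver = {
              .name     = "my_dev",
              .remove   = my_remove,
              .shutdown = my_shutdown,
      };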
    • net/mlx4_core: Fix SRIOV free-pool management when enforcing resource quotas · 95646373
      Committed by Jack Morgenstein
      The Hypervisor driver tracks free slots and reserved slots at the global level
      and tracks allocated slots and guaranteed slots per VF.
      
      Guaranteed slots are treated as reserved by the driver, so the total
      number of reserved slots is the sum of all guaranteed slots over all VFs.
      
      As VFs allocate resources, free (global) is decremented and allocated (per VF)
      is incremented for those resources. However, reserved (global) is never changed.
      
      This means that effectively, when a VF allocates a resource from its
      guaranteed pool, it is actually reducing that resource's free pool (since
      the global reserved count was not also reduced).
      
      The fix for this problem is the following: For each resource, as long as a
      VF's allocated count is <= its guaranteed number, when allocating for that
      VF, the reserved count (global) should be reduced by the allocation as well.
      
      When the global reserved count reaches zero, the remaining global free count
      is still accessible as the free pool for that resource.
      
      When the VF frees resources, the reverse happens: the global reserved
      count for a resource is incremented only once the VF's allocated count
      falls below its guaranteed number (see the sketch below).
      
      This fix was developed by Rick Kready <kready@us.ibm.com>
      Reported-by: Rick Kready <kready@us.ibm.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
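
      One way to read the accounting rule described above is the pair of
      helpers below; the field names and exact checks are assumptions, not
      the mlx4_core resource tracker itself:

      /*
       * Global per-resource pool plus per-VF quota state: "free" counts all
       * unallocated slots, "reserved" is the sum of still-unused guarantees,
       * so free - reserved is what the shared pool can actually hand out.
       */
      struct res_pool { int free; int reserved; };
      struct vf_quota { int allocated; int guaranteed; };

      static int imin(int a, int b) { return a < b ? a : b; }

      /* Grant `count` slots to a VF, charging its guarantee first. */
      static int grant_res(struct res_pool *p, struct vf_quota *vf, int count)
      {
              int from_guarantee = 0;

              if (vf->allocated < vf->guaranteed)
                      from_guarantee = imin(count, vf->guaranteed - vf->allocated);

              /* slots beyond the guarantee must fit in the unreserved part */
              if (count - from_guarantee > p->free - p->reserved)
                      return -1;

              p->free       -= count;
              p->reserved   -= from_guarantee;  /* the fix: shrink reserved too */
              vf->allocated += count;
              return 0;
      }

      /* Release `count` slots; re-reserve what drops back under the guarantee. */
      static void release_res(struct res_pool *p, struct vf_quota *vf, int count)
      {
              int back_under;

              vf->allocated -= count;
              p->free       += count;

              back_under = vf->guaranteed - vf->allocated;
              if (back_under > 0)
                      p->reserved += imin(back_under, count);
      }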
  18. 07 June 2014 (1 commit)
  19. 05 June 2014 (1 commit)
  20. 03 June 2014 (2 commits)