1. 26 Jul, 2012 1 commit
    • mlx4: Add support for EEH error recovery · 57dbf29a
      Kleber Sacilotto de Souza committed
      Currently the mlx4 drivers don't have the necessary callbacks to
      implement EEH error detection and recovery, so the PCI layer uses the
      probe and remove callbacks to try to recover the device after an error on
      the bus. However, these callbacks race with the internal catastrophic
      error recovery functions, which also detect the error. This can crash
      the system if both the EEH and catas functions try to reset the device.
      
      This patch adds the necessary error recovery callbacks and makes sure
      that the internal catastrophic error functions will not try to reset the
      device in such scenarios. It also adds some calls to
      pci_channel_offline() to suppress reads/writes on the bus when the slot
      cannot accept I/O operations so we prevent unnecessary accesses to the
      bus and speed up the device removal.
      Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Acked-by: Shlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
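The guard described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct fake_dev`, `fake_read_reg()` and `fake_catas_poll()` are hypothetical names invented for illustration, and the `channel_offline` field merely stands in for what `pci_channel_offline()` would report in the kernel. It is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; not the kernel's struct pci_dev. */
struct fake_dev {
    bool channel_offline; /* models the result of pci_channel_offline() */
    bool fatal_error;     /* models an internally detected catastrophic error */
    int  resets;          /* number of resets issued by the internal handler */
    int  bus_reads;       /* number of register reads that reached the bus */
};

/* Register read that is suppressed while the slot cannot accept I/O. */
static unsigned fake_read_reg(struct fake_dev *dev)
{
    assert(dev != NULL);
    if (dev->channel_offline)
        return 0xffffffffu; /* skip the bus access entirely */
    dev->bus_reads++;
    return 0;
}

/*
 * Internal catastrophic-error poll. When the channel is offline, recovery
 * is left to the bus-level error handlers, so no reset is issued here.
 */
static void fake_catas_poll(struct fake_dev *dev)
{
    assert(dev != NULL);
    if (dev->channel_offline)
        return;
    if (dev->fatal_error) {
        dev->resets++;
        dev->fatal_error = false;
    }
}
```

The point of the model is that only one recovery path acts at a time: while the channel is offline, the internal poll neither touches the bus nor resets the device.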
  2. 20 Jul, 2012 1 commit
    • mlx4_en: map entire pages to increase throughput · 4cce66cd
      Thadeu Lima de Souza Cascardo committed
      In its receive path, the mlx4_en driver maps each page chunk that it pushes
      to the hardware and unmaps it when pushing it up the stack. This limits
      throughput to about 3 Gbps on a Power7 8-core machine.
      
      One solution is to map the entire allocated page at once. However, this
      requires that we keep track of every page fragment we give to a
      descriptor. We also need to work with the discipline that all fragments
      are released (in the sense that they will not be reused by the driver
      anymore) in the order they were allocated to the driver.
      
      This requires that we don't reuse any fragments: every single one of
      them must be reallocated. We do that by releasing all the fragments that
      are processed, and we start the refill only after we have finished
      processing the descriptors.
      
      We also must somehow guarantee that we either refill all fragments in a
      descriptor or none at all, without resorting to giving up a page
      fragment that we would have already given. Otherwise, we would break the
      discipline of only releasing the fragments in the order they were
      allocated.
      
      This has passed page allocation fault injection (restricted to the
      driver by using require-start and require-end) and device hotplug
      while 16 TCP streams were able to deliver more than 9 Gbps.
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
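The two rules in the commit message above (fragments are released in allocation order, and a descriptor is refilled with all of its fragments or none) can be modelled with a toy FIFO pool. This is a sketch, not the driver's implementation: `struct frag_pool`, `POOL_FRAGS` and the function names are hypothetical, and the "check availability before handing anything out" strategy is one assumed way to get all-or-nothing behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_FRAGS 8u /* hypothetical number of fragments in the mapped pages */

struct frag_pool {
    unsigned head; /* total fragments handed out so far */
    unsigned tail; /* total fragments released so far */
};

/* Fragments that can still be handed out without reusing a live one. */
static unsigned pool_available(const struct frag_pool *p)
{
    return POOL_FRAGS - (p->head - p->tail);
}

/*
 * All-or-nothing refill: check that every fragment of the descriptor can
 * be provided before handing out the first one, so a failure never has to
 * give back a fragment that was already assigned.
 */
static bool refill_descriptor(struct frag_pool *p, unsigned nfrags, unsigned *out)
{
    assert(p != NULL && out != NULL);
    if (pool_available(p) < nfrags)
        return false; /* nothing was handed out */
    for (unsigned i = 0; i < nfrags; i++)
        out[i] = p->head++ % POOL_FRAGS;
    return true;
}

/* Release is only accepted for the oldest outstanding fragment. */
static bool release_fragment(struct frag_pool *p, unsigned frag)
{
    assert(p != NULL);
    if (p->head == p->tail || frag != p->tail % POOL_FRAGS)
        return false; /* out-of-order release is rejected */
    p->tail++;
    return true;
}
```

Because a failed refill leaves `head` untouched, the in-order release rule is never violated by a partially filled descriptor.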
  3. 19 Jul, 2012 3 commits
  4. 17 Jul, 2012 2 commits
  5. 08 Jul, 2012 10 commits
  6. 05 Jul, 2012 1 commit
  7. 26 Jun, 2012 3 commits
  8. 07 Jun, 2012 2 commits
    • mlx4_core: Fix setting VL_cap in mlx4_SET_PORT wrapper flow · edc4a67e
      Jack Morgenstein committed
      Commit 096335b3 ("mlx4_core: Allow dynamic MTU configuration for
      IB ports") modifies the port VL setting.  This exposes a bug in
      mlx4_common_set_port(), where the VL cap value passed in (inside the
      command mailbox) is incorrectly zeroed-out:
      
      mlx4_SET_PORT modifies the VL_cap field (byte 3 of the mailbox).
      Since the SET_PORT command is paravirtualized on the master as well as
      on the slaves, mlx4_SET_PORT_wrapper() is invoked on the master.  This
      calls mlx4_common_set_port() where mailbox byte 3 gets overwritten by
      code which should only set a single bit in that byte (for the reset
      qkey counter flag) -- but instead overwrites the entire byte.
      
      The result is that when running in SR-IOV mode, the VL_cap will be set
      to zero -- fix this.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
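The class of bug described above, assigning to a byte where only one bit should be set, can be shown in a few lines. This is a minimal sketch and not the actual driver change: the bit position `RESET_QKEY_BIT` and the function names are assumptions made for illustration. Only the mailbox byte index 3 comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RESET_QKEY_BIT 6u /* hypothetical position of the reset-qkey flag */

/* Buggy form: plain assignment replaces the whole byte, wiping VL_cap. */
static void set_reset_qkey_buggy(uint8_t *mailbox, int reset_qkey)
{
    assert(mailbox != NULL);
    mailbox[3] = (uint8_t)(!!reset_qkey << RESET_QKEY_BIT);
}

/* Fixed form: OR in the single flag bit and keep the other bits intact. */
static void set_reset_qkey_fixed(uint8_t *mailbox, int reset_qkey)
{
    assert(mailbox != NULL);
    mailbox[3] |= (uint8_t)(!!reset_qkey << RESET_QKEY_BIT);
}
```

With a VL_cap value already stored in byte 3, the buggy form leaves zero (or only the flag bit) behind, while the fixed form preserves the value.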
    • ethernet: Remove casts to same type · 64699336
      Joe Perches committed
      Adding casts of objects to the same type is unnecessary
      and confusing for a human reader.
      
      For example, this cast:
      
              int y;
              int *p = (int *)&y;
      
      I used the coccinelle script below to find and remove these
      unnecessary casts.  I manually removed the conversions this
      script produces of casts with __force, __iomem and __user.
      
      @@
      type T;
      T *p;
      @@
      
      -       (T *)p
      +       p
      
      A function in atl1e_main.c was passed a const pointer
      when it actually modified elements of the structure.
      
      Change the argument to a non-const pointer.
      
      A function in stmmac needed a __force to avoid a sparse
      warning.  Added it.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 01 Jun, 2012 6 commits
  10. 18 May, 2012 1 commit
    • net/mlx4_en: num cores tx rings for every UP · bc6a4744
      Amir Vadai committed
      Change the TX ring scheme such that the number of rings for untagged packets
      and for tagged packets (for each of the VLAN priorities) is the same, unlike
      the current situation where tagged traffic gets one ring per priority
      and untagged traffic gets as many rings as the number of cores.
      
      Queue selection is done as follows:
      
      If the mqprio qdisc operates on the interface, such that the core networking
      code invoked the device's setup_tc ndo callback, a mapping of skb->priority =>
      queue set is forced for both tagged and untagged traffic.
      
      Else, the egress map skb->priority => user priority (UP) is used for tagged traffic, and
      all untagged traffic is sent through the TX rings of UP 0.
      
      The patch follows the conclusion reached when discussing this issue with John Fastabend
      in this thread: http://comments.gmane.org/gmane.linux.network/229877
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Liran Liss <liranl@mellanox.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
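The queue selection rules in the commit message above can be sketched as a small decision function. Everything here is a model under assumptions: `struct fake_skb`, `struct fake_tx_cfg`, the 16-entry maps and the use of a flow hash to pick a ring inside a queue set are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SKB_PRIO 16u /* hypothetical size of the priority maps */

/* Hypothetical packet metadata; not the kernel's struct sk_buff. */
struct fake_skb {
    uint32_t priority;
    bool     vlan_tagged;
    uint32_t hash; /* assumed flow hash used to spread within a queue set */
};

struct fake_tx_cfg {
    bool     mqprio_active;                   /* setup_tc was invoked */
    unsigned rings_per_up;                    /* e.g. number of cores */
    uint8_t  prio_to_queue_set[NUM_SKB_PRIO]; /* forced mqprio mapping */
    uint8_t  egress_map[NUM_SKB_PRIO];        /* skb->priority => UP */
};

static unsigned select_tx_ring(const struct fake_tx_cfg *cfg,
                               const struct fake_skb *skb)
{
    assert(cfg != NULL && skb != NULL && cfg->rings_per_up > 0);
    unsigned prio = skb->priority % NUM_SKB_PRIO;
    unsigned up;

    if (cfg->mqprio_active)
        up = cfg->prio_to_queue_set[prio]; /* tagged and untagged alike */
    else if (skb->vlan_tagged)
        up = cfg->egress_map[prio];        /* tagged: egress map */
    else
        up = 0;                            /* untagged: rings of UP 0 */

    /* Every UP owns the same number of rings; pick one inside its set. */
    return up * cfg->rings_per_up + skb->hash % cfg->rings_per_up;
}
```

Since each UP owns `rings_per_up` consecutive rings, untagged traffic gets the same number of rings as any single VLAN priority.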
  11. 16 May, 2012 8 commits
  12. 15 May, 2012 1 commit
    • mlx4_core: Change bitmap allocator to work in round-robin fashion · f4ec9e95
      Jack Morgenstein committed
      Under most circumstances, the bitmap allocator does not allocate the
      same full 24-bit QP number immediately after a QP is destroyed.
      
      This works by using the upper bits of a 24-bit QP number, beyond the
      number of QPs that are actually available in the low level driver.
      For example, say that the HCA is willing to allocate a maximum of 64K
      QPs.  We use bits 23..16 as a "counter" which is incremented by 1
      at each allocation so that even if the same physical QP is
      re-allocated, it will not receive the same 24-bit QP number.
      
      However, we have seen the following scenario:
      1. Allocate, say, 255 QPs in succession.  This will cause a wrap of the "counter".
      2. Destroy the first QP allocated, then allocate a new QP.  The new QP,
         because of the counter wraparound, will get the same FULL QP number as
         the QP just destroyed!
      
      This is a problem because packets in transit can be erroneously
      delivered to the new QP when they were meant for the old (destroyed)
      QP, because the full QP number of the new QP is identical to the
      destroyed QP.  (The "counter" mechanism is meant to prevent this by
      having the full 24-bit QP numbers differ even if the physical QP on
      the HCA is the same.  As we see above, however, this mechanism does
      not always work).
      
      The best fix for this problem is to allocate QPs in round-robin mode,
      so that the physical QP numbers are not immediately re-used.
      Found-by: Matthew Finlay <matt@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
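The wraparound scenario above can be reproduced with a simplified model of the allocator. This sketch follows the description in the commit message (64K physical QPs, bits 23..16 used as a counter bumped on each allocation) rather than the driver's bitmap code, and all names are hypothetical. In this particular model the 8-bit counter wraps after 256 allocations; the exact count depends on how the counter is advanced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IDX_BITS 16u              /* 64K physical QPs, as in the example */
#define NUM_QPS  (1u << IDX_BITS)
#define CTR_MASK 0xffu            /* bits 23..16 hold the 8-bit counter */

struct qp_alloc {
    bool     used[NUM_QPS];
    uint32_t counter;     /* bumped on every allocation */
    uint32_t next;        /* where a round-robin search starts */
    bool     round_robin; /* false: always search from index 0 */
};

/* Returns a full 24-bit QP number, or UINT32_MAX if no QP is free. */
static uint32_t qp_alloc_one(struct qp_alloc *a)
{
    assert(a != NULL);
    uint32_t start = a->round_robin ? a->next : 0;
    for (uint32_t i = 0; i < NUM_QPS; i++) {
        uint32_t idx = (start + i) % NUM_QPS;
        if (a->used[idx])
            continue;
        a->used[idx] = true;
        a->next = (idx + 1) % NUM_QPS;
        uint32_t qpn = ((a->counter & CTR_MASK) << IDX_BITS) | idx;
        a->counter++;
        return qpn;
    }
    return UINT32_MAX;
}

static void qp_free_one(struct qp_alloc *a, uint32_t qpn)
{
    assert(a != NULL);
    a->used[qpn & (NUM_QPS - 1)] = false;
}

/*
 * Allocate 256 QPs, destroy the first, allocate once more, and report
 * whether the new QP received the same full number as the destroyed one.
 */
static bool reuses_number(bool round_robin)
{
    static struct qp_alloc a;
    memset(&a, 0, sizeof a);
    a.round_robin = round_robin;
    uint32_t first = qp_alloc_one(&a);
    for (int i = 1; i < 256; i++)
        (void)qp_alloc_one(&a);
    qp_free_one(&a, first);
    return qp_alloc_one(&a) == first;
}
```

With lowest-free-first search the freed physical QP is picked again just as the counter has wrapped, so the full number repeats. With round-robin search the next allocation lands on a different physical QP, so the number differs.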
  13. 09 May, 2012 1 commit
    • mlx4_core: Add second capabilities flags field · b3416f44
      Shlomo Pongratz committed
      This patch adds a 64-bit flags2 features member to struct mlx4_dev to
      export further features of the hardware.  The original flags field
      tracks features whose support bits are advertised by the firmware in
      offsets 0x40 and 0x44 of the query device capabilities command.
      flags2 will track features whose support bits are scattered at various
      offsets.
      
      RSS support is the first feature to be exported through flags2.  RSS
      capabilities are located at offset 0x2e.  The size of the RSS
      indirection table is also given in this offset.
      Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
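The idea of a second flags word filled from scattered offsets can be sketched as follows. Only the offset 0x2e comes from the commit message. The flag name `FAKE_FLAG2_RSS`, the `struct fake_caps` layout and the assumption that the low nibble of the byte encodes the log2 of the indirection table size are hypothetical, chosen only to make the example concrete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_RSS_OFFSET 0x2e         /* offset named in the commit message */
#define FAKE_FLAG2_RSS (1ULL << 0)  /* hypothetical flags2 bit for RSS */

/* Hypothetical capability record; not the driver's structure. */
struct fake_caps {
    uint64_t flags;          /* bits taken from offsets 0x40 and 0x44 */
    uint64_t flags2;         /* bits gathered from scattered offsets */
    unsigned max_rss_tbl_sz; /* RSS indirection table size, 0 if no RSS */
};

/* Parse the RSS byte of a query-device-capabilities reply buffer. */
static void parse_rss_cap(const uint8_t *outbox, size_t len,
                          struct fake_caps *caps)
{
    assert(outbox != NULL && caps != NULL && len > CAP_RSS_OFFSET);
    uint8_t field  = outbox[CAP_RSS_OFFSET];
    uint8_t log_sz = field & 0x0f; /* assumed: low nibble = log2(table size) */

    if (log_sz) {
        caps->flags2 |= FAKE_FLAG2_RSS;
        caps->max_rss_tbl_sz = 1u << log_sz;
    }
}
```

Keeping such bits in a separate 64-bit word leaves the original flags field as a direct copy of the firmware-advertised words, while consumers test `flags2` for features found elsewhere in the reply.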