1. 27 Oct, 2014 (1 commit)
    • net/mlx4_core: Call synchronize_irq() before freeing EQ buffer · bf1bac5b
      Committed by Eli Cohen
      After moving EQ ownership to software, which effectively destroys the EQ,
      call synchronize_irq() to ensure that any handler routines running on other
      CPU cores finish execution. Only then free the EQ buffer.
      The same thing is done when we destroy a CQ, which is one of the sources
      generating interrupts. In the CQ case we want to avoid running completion
      handlers on a CQ that was destroyed. In the EQ case we do the same to avoid
      receiving asynchronous events after the EQ has been destroyed and its
      buffers freed.
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 20 Sep, 2014 (1 commit)
  3. 03 Jul, 2014 (1 commit)
  4. 02 Jun, 2014 (2 commits)
  5. 15 May, 2014 (1 commit)
  6. 09 May, 2014 (1 commit)
  7. 21 Mar, 2014 (1 commit)
    • net/mlx4: Adapt code for N-Port VF · 449fc488
      Committed by Matan Barak
      Adds support for N-Port VFs. This includes:
      1. Adding support in the wrapped FW command
      	In wrapped commands, we need to verify and convert
      	the slave's port into the real physical port.
      	Furthermore, when sending the response back to the slave,
      	a reverse conversion should be made.
      2. Adjusting sqpn for QP1 para-virtualization
      	The slave assumes that sqpn is used for QP1 communication.
      	If the slave is assigned to a port != (first port), we need
      	to adjust the sqpn that will direct its QP1 packets into the
      	correct endpoint.
      3. Adjusting gid[5] to modify the port for raw ethernet
      	In B0 steering, gid[5] contains the port. It needs
      	to be adjusted into the physical port.
      4. Adjusting number of ports in the query / ports caps in the FW commands
      	When a slave queries the hardware, it needs to view only
      	the physical ports it's assigned to.
      5. Adjusting the sched_qp according to the port number
      	The QP port is encoded in the sched_qp, thus in modify_qp we need
      	to encode the correct port in sched_qp.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 17 Jan, 2014 (1 commit)
  9. 10 Dec, 2013 (1 commit)
    • mlx4_core: Roll back round robin bitmap allocation commit for CQs, SRQs, and MPTs · 7c6d74d2
      Committed by Jack Morgenstein
      Commit f4ec9e95 "mlx4_core: Change bitmap allocator to work in round-robin fashion"
      introduced round-robin allocation (via bitmap) for all resources which allocate
      via a bitmap.
      
      Round robin allocation is desirable for mcgs, counters, pd's, UARs, and xrcds.
      These are simply numbers, with no involvement of ICM memory mapping.
      
      Round robin is required for QPs, since we had a problem with immediate
      reuse of a 24-bit QP number (commit f4ec9e95).
      
      However, for other resources which use the bitmap allocator and involve
      mapping ICM memory -- MPTs, CQs, SRQs -- round-robin is not desirable.
      
      What happens in these cases is the following:
      
      ICM memory is allocated and mapped in chunks of 256K.
      
      Since the resource allocation index goes up monotonically, the allocator
      will eventually require mapping a new chunk. Now, chunks are also unmapped
      when their reference count goes back to zero.  Thus, if a single app is
      running and starts/exits frequently we will have the following situation:
      
      When the app starts, a new chunk must be allocated and mapped.
      
      When the app exits, the chunk reference count goes back to zero, and the
      chunk is unmapped and freed. Therefore, the app must pay the cost of allocation
      and mapping of ICM memory each time it runs (although the price is paid only when
      allocating the initial entry in the new chunk).
      
      For apps which allocate MPTs/SRQs/CQs and which operate as described above,
      this presented a performance problem.
      
      We therefore roll back the round-robin allocator modification for MPTs, CQs, SRQs.
      Reported-by: Matthew Finlay <matt@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
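The effect described in commit 7c6d74d2 can be reproduced with a toy model. The sketch below is not the driver's allocator: `Bitmap`, `ChunkTable`, `count_chunk_mappings`, and the tiny chunk size are hypothetical names and values chosen for illustration. It also assumes that one long-lived entry keeps the first chunk mapped, which is why first-fit allocation can keep reusing it.

```python
CHUNK_ENTRIES = 4  # toy value: entries per ICM chunk (hypothetical)


class Bitmap:
    """Index allocator that searches first-fit or round-robin."""

    def __init__(self, size, round_robin):
        self.size = size
        self.round_robin = round_robin
        self.used = [False] * size
        self.last = 0  # where the next round-robin search starts

    def alloc(self):
        start = self.last if self.round_robin else 0
        for off in range(self.size):
            idx = (start + off) % self.size
            if not self.used[idx]:
                self.used[idx] = True
                self.last = (idx + 1) % self.size
                return idx
        raise MemoryError("bitmap full")

    def free(self, idx):
        self.used[idx] = False


class ChunkTable:
    """Reference-counts chunks and counts how often one must be mapped."""

    def __init__(self):
        self.refs = {}
        self.map_count = 0

    def get(self, idx):
        chunk = idx // CHUNK_ENTRIES
        if self.refs.get(chunk, 0) == 0:
            self.map_count += 1  # chunk was unmapped: pay the mapping cost
        self.refs[chunk] = self.refs.get(chunk, 0) + 1

    def put(self, idx):
        # When the count reaches zero the chunk counts as unmapped again.
        self.refs[idx // CHUNK_ENTRIES] -= 1


def count_chunk_mappings(round_robin, runs=6, per_run=2, size=64):
    bitmap, chunks = Bitmap(size, round_robin), ChunkTable()
    chunks.get(bitmap.alloc())  # long-lived entry pins chunk 0 (assumption)
    for _ in range(runs):  # each iteration models one start/exit of the app
        held = [bitmap.alloc() for _ in range(per_run)]
        for idx in held:
            chunks.get(idx)
        for idx in held:
            chunks.put(idx)
            bitmap.free(idx)
    return chunks.map_count
```

Under these assumptions, first-fit maps a chunk once and then reuses it on every run, while round-robin keeps walking into chunks that have to be mapped again. Comparing `count_chunk_mappings(False)` with `count_chunk_mappings(True)` shows the difference.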
  10. 08 Nov, 2013 (1 commit)
  11. 29 Jul, 2013 (1 commit)
  12. 14 Jun, 2013 (1 commit)
  13. 25 Apr, 2013 (1 commit)
  14. 22 Mar, 2013 (1 commit)
  15. 30 Nov, 2012 (1 commit)
  16. 27 Nov, 2012 (1 commit)
    • mlx4: 64-byte CQE/EQE support · 08ff3235
      Committed by Or Gerlitz
      ConnectX-3 devices can use either 64- or 32-byte completion queue
      entries (CQEs) and event queue entries (EQEs).  Using 64-byte
      EQEs/CQEs performs better because each entry is aligned to a complete
      cacheline.  This patch queries the HCA's capabilities, and if it
      supports 64-byte CQEs and EQEs, the driver will configure the HW to
      work in 64-byte mode.
      
      The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ.
      
      Since this mode is global, userspace (libmlx4) must be updated to work
      with the configured CQE size, and guests using SR-IOV virtual
      functions need to know both EQE and CQE size.
      
      In case one of the 64-byte CQE/EQE capabilities is activated, the
      patch makes sure that older guest drivers that use the QUERY_DEV_FUNC
      command (e.g. as done in mlx4_core of Linux 3.3..3.6) will notice that
      they need an update to be able to work with the PPF. This is done by
      changing the returned pf_context_behaviour not to be zero any more. In
      case none of these capabilities is activated that value remains zero
      and older guest drivers can run OK.
      
      The SRIOV related flow is as follows:
      
      1. the PPF does the detection of the new capabilities using
         QUERY_DEV_CAP command.
      
      2. the PPF activates the new capabilities using INIT_HCA.
      
      3. the VF detects if the PPF activated the capabilities using
         QUERY_HCA, and if this is the case activates them for itself too.
      
      Note that the VF detects that it must be aware of the new PF behaviour
      using QUERY_FUNC_CAP.  Steps 1 and 2 apply also for native mode.
      
      User space notification is done through a new field introduced in
      struct mlx4_ib_ucontext which holds device capabilities for which user
      space must take action. This changes the binary interface so the ABI
      towards libmlx4 exposed through uverbs is bumped from 3 to 4 but only
      when **needed**, i.e. only when the driver does use 64-byte CQEs or
      future device capabilities which must be kept in sync with user space.
      This practice allows unmodified libmlx4 to work on older devices (e.g.
      A0, B0) which don't support 64-byte CQEs.
      
      In order to keep existing systems functional when they update to a
      newer kernel that contains these changes in VF and userspace ABI, a
      module parameter enable_64b_cqe_eqe must be set to enable 64-byte
      mode; the default is currently false.
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
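The decision logic described in commit 08ff3235 can be summarised in a few lines. This is a simplified sketch, not driver code: `select_entry_size` and `vf_entry_size` are hypothetical helpers, and the separate CQE and EQE capabilities are folded into a single flag for brevity.

```python
def select_entry_size(hw_supports_64b, enable_64b_cqe_eqe):
    """Return (entry size in bytes, uverbs ABI version) chosen by the PF.

    64-byte mode needs both the HCA capability and the module parameter,
    and the ABI is bumped from 3 to 4 only when 64-byte mode is in use.
    """
    use_64b = hw_supports_64b and enable_64b_cqe_eqe
    return (64, 4) if use_64b else (32, 3)


def vf_entry_size(pf_entry_size, vf_driver_supports_64b):
    """A VF follows the PF's global mode; an old guest driver cannot."""
    if pf_entry_size == 64 and not vf_driver_supports_64b:
        raise RuntimeError("guest driver needs an update to work with this PF")
    return pf_entry_size
```

With the module parameter left at its default of false, `select_entry_size(True, False)` stays at 32-byte entries and ABI version 3, so existing guests and an unmodified libmlx4 keep working.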
  17. 19 Nov, 2012 (1 commit)
  18. 26 Oct, 2012 (1 commit)
  19. 24 Oct, 2012 (1 commit)
  20. 01 Oct, 2012 (2 commits)
    • IB/mlx4: Miscellaneous adjustments for SR-IOV IB support · 992e8e6e
      Committed by Jack Morgenstein
      1. Allow only master to change node description.
      2. Prevent AH leakage in send mads.
      3. Take device part number from PCI structure, so that guests see the
         VF part number (and not the PF part number).
      4. Place the device revision ID into caps structure at startup.
      5. SET_PORT in update_gids_task needs to go through wrapper on master.
      6. In mlx4_ib_event(), PORT_MGMT_EVENT needs to be handled in a work
         queue on the master, since it propagates events to slaves using
         GEN_EQE.
      7. Do not support FMR on slaves.
      8. Add spinlock to slave_event(), since it is called both in interrupt
         context and in process context (due to 6 above, and also if
         smp_snoop is used).  This fix was found and implemented by Saeed
         Mahameed <saeedm@mellanox.com>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • mlx4_core: Add IB port-state machine and port mgmt event propagation · 993c401e
      Committed by Jack Morgenstein
      For an IB port, a slave should not show port active until that slave
      has a valid alias-guid (provided by the subnet manager).  Therefore
      the port-up event should be passed to a slave only after both the port
      is up, and the slave's alias-guid has been set.
      
      Also, provide the infrastructure for propagating port-management
      events (client-reregister, etc) to slaves.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
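The rule in commit 993c401e (a slave sees port-up only once the port is up and its alias-guid is set) is a small state machine. The sketch below is illustrative only: `SlavePortState` and its event names are hypothetical, and sending a port-down event when either condition is lost is an assumption that the commit message does not spell out.

```python
class SlavePortState:
    """Tracks what one slave should be told about one IB port."""

    def __init__(self):
        self.port_up = False      # physical port state
        self.guid_valid = False   # alias-guid provided by the subnet manager
        self.reported_up = False  # what the slave has been told so far

    def _update(self):
        events = []
        should_be_up = self.port_up and self.guid_valid
        if should_be_up and not self.reported_up:
            events.append("PORT_UP")
        elif not should_be_up and self.reported_up:
            events.append("PORT_DOWN")  # assumption, see lead-in
        self.reported_up = should_be_up
        return events

    def on_port_change(self, up):
        self.port_up = up
        return self._update()

    def on_guid_change(self, valid):
        self.guid_valid = valid
        return self._update()
```

The order of the two inputs does not matter: whichever arrives second triggers the single `PORT_UP` event, and repeated notifications produce no duplicates.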
  21. 19 Jul, 2012 (1 commit)
  22. 11 Jul, 2012 (1 commit)
    • mlx4: Use port management change event instead of smp_snoop · 00f5ce99
      Committed by Jack Morgenstein
      The port management change event can replace smp_snoop.  If the
      capability bit for this event is set in dev-caps, the event is used
      (by the driver setting the PORT_MNG_CHG_EVENT bit in the async event
      mask in the MAP_EQ fw command).  In this case, when the driver passes
      incoming SMP PORT_INFO SET mads to the FW, the FW generates port
      management change events to signal any changes to the driver.
      
      If the FW generates these events, smp_snoop shouldn't be invoked in
      ib_process_mad(), or duplicate events will occur (once from the
      FW-generated event, and once from smp_snoop).
      
      In the case where the FW does not generate port management change
      events, smp_snoop needs to be invoked to create these events.  The flow
      in smp_snoop has been modified to make use of the same procedures as
      in the FW-generated-event case to generate the port management
      events (LID change, Client-rereg, Pkey change, and/or GID change).
      
      Port management change event handling required changing the
      mlx4_ib_event and mlx4_dispatch_event prototypes; the "param" argument
      (last argument) had to be changed to unsigned long in order to
      accommodate passing the EQE pointer.
      
      We also needed to move the definition of struct mlx4_eqe from
      net/mlx4.h to file device.h -- to make it available to the IB driver,
      to handle port management change events.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  23. 01 Jun, 2012 (1 commit)
  24. 13 Mar, 2012 (1 commit)
  25. 22 Feb, 2012 (1 commit)
    • mlx4: Replacing pool_lock with mutex · 730c41d5
      Committed by Yevgeny Petrilin
      Under the spinlock we call request_irq(), which allocates memory with
      GFP_KERNEL. When DEBUG_SPINLOCK is enabled, this can cause the following
      trace:
      
       BUG: spinlock wrong CPU on CPU#2, ethtool/2595
       lock: ffff8801f9cbc2b0, .magic: dead4ead, .owner: ethtool/2595, .owner_cpu: 0
       Pid: 2595, comm: ethtool Not tainted 3.0.18 #2
       Call Trace:
       spin_bug+0xa2/0xf0
       do_raw_spin_unlock+0x71/0xa0
       _raw_spin_unlock+0xe/0x10
       mlx4_assign_eq+0x12b/0x190 [mlx4_core]
       mlx4_en_activate_cq+0x252/0x2d0 [mlx4_en]
       ? mlx4_en_activate_rx_rings+0x227/0x370 [mlx4_en]
       mlx4_en_start_port+0x189/0xb90 [mlx4_en]
       mlx4_en_set_ringparam+0x29a/0x340 [mlx4_en]
       dev_ethtool+0x816/0xb10
       ? dev_get_by_name_rcu+0xa4/0xe0
       dev_ioctl+0x2b5/0x470
       handle_mm_fault+0x1cd/0x2d0
       sock_do_ioctl+0x5d/0x70
       sock_ioctl+0x79/0x2f0
       do_vfs_ioctl+0x8c/0x340
       sys_ioctl+0xa1/0xb0
       system_call_fastpath+0x16/0x1b
      
      Replace the spinlock with a mutex, which is sufficient in this case.
      Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 14 Feb, 2012 (1 commit)
  27. 23 Jan, 2012 (2 commits)
  28. 14 Dec, 2011 (4 commits)
    • mlx4_core: resource tracking for HCA resources used by guests · c82e9aa0
      Committed by Eli Cohen
      The resource tracker is used to track usage of HCA resources by the different
      guests.
      
      Virtual functions (VFs) are attached to guest operating systems but
      resources are allocated from the same pool and are assigned to VFs. It is
      essential that hostile/buggy guests not be able to affect the operation of
      other VFs, possibly attached to other guest OSs since ConnectX firmware is not
      tolerant to misuse of resources.
      
      The resource tracker module associates each resource with a VF and maintains
      state information for the allocated object. It also defines allowed state
      transitions and enforces them.
      
      Relationships between resources are also referred to. For example, CQs are
      pointed to by QPs, so it is forbidden to destroy a CQ if a QP refers to it.
      
      ICM memory is always accessible through the primary function and hence it is
      allocated by the owner of the primary function.
      
      When a guest dies, an FLR is generated for all the VFs it owns and all the
      resources it used are freed.
      
      The tracked resource types are: QPs, CQs, SRQs, MPTs, MTTs, MACs, RES_EQs,
      and XRCDNs.
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
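The ideas in commit c82e9aa0 (per-VF ownership, enforced relationships between resources, and cleanup on FLR) can be sketched with a small in-memory model. This is a hedged illustration rather than the driver's data structures: `ResourceTracker`, `TrackerError`, and the method names are hypothetical, and only CQs and QPs are modelled.

```python
class TrackerError(Exception):
    """Raised when a slave attempts a disallowed operation."""


class ResourceTracker:
    def __init__(self):
        self.owner = {}    # (type, number) -> owning slave
        self.qp_cq = {}    # QP number -> CQ number it points to
        self.cq_refs = {}  # CQ number -> count of QPs pointing to it

    def _check_owner(self, slave, key):
        if self.owner.get(key) != slave:
            raise TrackerError(f"{key} is not owned by slave {slave}")

    def alloc_cq(self, slave, cqn):
        if ("CQ", cqn) in self.owner:
            raise TrackerError("CQ already allocated")
        self.owner[("CQ", cqn)] = slave
        self.cq_refs[cqn] = 0

    def create_qp(self, slave, qpn, cqn):
        self._check_owner(slave, ("CQ", cqn))  # no use of another VF's CQ
        if ("QP", qpn) in self.owner:
            raise TrackerError("QP already allocated")
        self.owner[("QP", qpn)] = slave
        self.qp_cq[qpn] = cqn
        self.cq_refs[cqn] += 1

    def destroy_qp(self, slave, qpn):
        self._check_owner(slave, ("QP", qpn))
        self.cq_refs[self.qp_cq.pop(qpn)] -= 1
        del self.owner[("QP", qpn)]

    def destroy_cq(self, slave, cqn):
        self._check_owner(slave, ("CQ", cqn))
        if self.cq_refs[cqn]:
            raise TrackerError("CQ is still referenced by a QP")
        del self.owner[("CQ", cqn)]
        del self.cq_refs[cqn]

    def flr(self, slave):
        """Free everything a slave owns, QPs first so CQ references drop."""
        for kind, destroy in (("QP", self.destroy_qp), ("CQ", self.destroy_cq)):
            owned = [n for (t, n), s in self.owner.items()
                     if t == kind and s == slave]
            for number in owned:
                destroy(slave, number)
```

Freeing QPs before CQs in `flr()` mirrors the dependency rule from the commit message: a CQ cannot be destroyed while a QP still refers to it.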
    • mlx4_core: Add wrapper functions and comm channel and slave event support to EQs · acba2420
      Committed by Jack Morgenstein
      Passing async events to slaves:
      In SRIOV mode, each slave creates its own async EQ, but only the master can
      register directly with the FW to receive async events.  Async events which
      should be passed to slaves (such as a WQ_ACCESS_ERROR for a QP owned by a slave)
      are generated at the slave by the master using the GEN_EQE FW command.
      
      Wrapper functions: mlx4_MAP_EQ_wrapper
      Only the master can map an EQ. The slave commands to map their EQs arrive
      at the master via the comm channel.  The master then invokes the wrapper
      function to do the work (and enter the resource in the tracking database).
      
      New events: COMM_CHANNEL and FLR
      The COMM_CHANNEL event arrives only at the master, and signals that
      a slave has posted a command on the comm channel.
      The FLR event is generated by the FW when a guest operating a VF
      unexpectedly goes down.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4_core: Add "native" argument to mlx4_cmd and its callers (where needed) · f9baff50
      Committed by Jack Morgenstein
      For SRIOV, some Hypervisor commands can be executed directly (native = 1).
      Others should go through the command wrapper flow (for tracking resource
      usage, for example, or for changing some HCA configurations that slaves
      need to be notified of).
      
      This patch sets the groundwork for this capability -- adding the correct
      value of "native" in each case.
      
      Note that if SRIOV is not activated, this parameter has no effect.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4_core: initial header-file changes for SRIOV support · 623ed84b
      Committed by Jack Morgenstein
      These changes will not affect module operation as yet. They
      are only to get some structs and enums in place for use by
      subsequent patches (making those smaller).
      
      Added here:
      * sriov state structs and inlines (mlx4_is_master/slave/mfunc)
      * comm-channel and vhcr support structures
      * enum values for new FW and comm-channel virtual commands
        (i.e., commands, passed via the comm channel to the PF-driver).
      * prototypes for many command wrapper functions (used by the
        PF context for processing FW commands passed to it by the VFs).
      * struct mlx4_eqe is moved from eq.c to mlx4.h (it will be used
        by other mlx4_core source files).
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  29. 01 Nov, 2011 (1 commit)
  30. 11 Aug, 2011 (1 commit)
  31. 31 Mar, 2011 (1 commit)
  32. 24 Mar, 2011 (1 commit)
    • mlx4: Changing interrupt scheme · 0b7ca5a9
      Committed by Yevgeny Petrilin
      Add a pool of MSI-X vectors and EQs that can be used explicitly by mlx4_core
      consumers (mlx4_ib, mlx4_en). The consumers assign their own names to the
      interrupt vectors. These vectors are not opened at mlx4 device
      initialization; they are opened on demand.
      The maximum number of possible EQs changes with the new scheme and no longer
      relies on the number of cores.
      The new functionality is exposed through mlx4_assign_eq() and mlx4_release_eq().
      Consumers that do not use the new API will get completion vectors as before.
      Signed-off-by: Markuze Alex <markuze@mellanox.co.il>
      Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
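The on-demand pool introduced by commit 0b7ca5a9 can be modelled as a fixed set of slots handed out by name. The sketch below only imitates the behaviour of mlx4_assign_eq() and mlx4_release_eq() as the commit message describes it. `VectorPool` and its methods are hypothetical, and the lock is a sleeping lock on the assumption that the real assignment path may block (as request_irq() does in commit 730c41d5 above).

```python
import threading


class VectorPool:
    """Pool of interrupt vectors that consumers claim and name on demand."""

    def __init__(self, n_vectors):
        self.lock = threading.Lock()  # sleeping lock, not a spinlock
        self.names = [None] * n_vectors

    def assign(self, name):
        """Claim the first free vector and label it with the consumer's name."""
        with self.lock:
            for vec, owner in enumerate(self.names):
                if owner is None:
                    self.names[vec] = name  # stands in for request_irq()
                    return vec
        raise RuntimeError("no free vector in the pool")

    def release(self, vec):
        """Return a vector to the pool so another consumer can claim it."""
        with self.lock:
            if self.names[vec] is None:
                raise ValueError("vector is not assigned")
            self.names[vec] = None
```

A consumer that never calls `assign()` is unaffected by the pool, which matches the commit's note that users of the old API get completion vectors as before.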
  33. 25 Aug, 2010 (1 commit)
  34. 16 Jul, 2010 (1 commit)
    • drivers/net/mlx4: Use %pV, pr_<level>, printk_once · 0a645e80
      Committed by Joe Perches
      Remove near duplication of format string constants by using the newly
      introduced vsprintf extension %pV to reduce text by 20k or so.
      
      $ size drivers/net/mlx4/built-in.o*
         text	   data	    bss	    dec	    hex	filename
       161367	   1866	  48784	 212017	  33c31	drivers/net/mlx4/built-in.o
       142621	   1866	  46248	 190735	  2e90f	drivers/net/mlx4/built-in.o.new
      
      Use printk_once as appropriate.
      Convert printks to pr_<level>, some bare printks now use pr_cont.
      Remove now unused #define PFX.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>