1. 04 Nov, 2017 6 commits
  2. 03 Nov, 2017 5 commits
  3. 02 Nov, 2017 1 commit
  4. 01 Nov, 2017 17 commits
  5. 28 Oct, 2017 1 commit
  6. 27 Oct, 2017 2 commits
  7. 26 Oct, 2017 4 commits
    • net/mlx5e: DCBNL, Implement tc with ets type and zero bandwidth · be0f161e
      Huy Nguyen committed
      Previously, a tc with ets type and zero bandwidth was not accepted
      by the driver. This behavior does not follow the IEEE 802.1Qaz spec.
      
      If there are tcs with ets type and zero bandwidth, these tcs are
      assigned to the lowest priority tc_group #0. We equally distribute
      100% bw of the tc_group #0 to these zero bandwidth ets tcs.
      Also, the non zero bandwidth ets tcs are assigned to tc_group #1.
      
      If there is no zero bandwidth ets tc, the non zero bandwidth ets tcs
      are assigned to tc_group #0.
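
The grouping rule above can be sketched in plain C. This is an illustrative userspace sketch, not the driver's code: `struct tc_cfg`, `assign_ets_groups()` and `GROUP_UNASSIGNED` are hypothetical names, non-ETS tcs are simply left unassigned, and giving the division remainder to the last zero-bandwidth tc is an assumption made here so that the shares in tc_group #0 sum to 100.

```c
#include <stddef.h>

#define MAX_BW_ALLOC     100   /* percent of a tc_group's bandwidth */
#define GROUP_UNASSIGNED 0xffu /* hypothetical marker for non-ETS tcs */

enum tsa_type { TSA_STRICT, TSA_ETS };

struct tc_cfg {
    enum tsa_type tsa;     /* in:  transmission selection algorithm */
    unsigned int bw;       /* in:  configured ETS bandwidth, percent */
    unsigned int group;    /* out: tc_group the tc is placed in */
    unsigned int group_bw; /* out: share of that group's bandwidth */
};

/* Place ETS tcs into tc_groups following the rule in the commit message. */
void assign_ets_groups(struct tc_cfg *tc, size_t n)
{
    size_t zero_cnt = 0, last_zero = 0, i;

    /* First pass: count the ETS tcs that have zero bandwidth. */
    for (i = 0; i < n; i++) {
        if (tc[i].tsa == TSA_ETS && tc[i].bw == 0) {
            zero_cnt++;
            last_zero = i;
        }
    }

    for (i = 0; i < n; i++) {
        if (tc[i].tsa != TSA_ETS) {
            /* Non-ETS tcs are outside the scope of this sketch. */
            tc[i].group = GROUP_UNASSIGNED;
            tc[i].group_bw = 0;
        } else if (tc[i].bw == 0) {
            /* Zero-bandwidth ETS tcs share group #0 equally. */
            tc[i].group = 0;
            tc[i].group_bw = MAX_BW_ALLOC / zero_cnt;
        } else {
            /* Non-zero ETS tcs move to group #1 only if group #0 is taken. */
            tc[i].group = zero_cnt ? 1 : 0;
            tc[i].group_bw = tc[i].bw;
        }
    }

    /* Assumption: the remainder of the equal split goes to the last tc. */
    if (zero_cnt)
        tc[last_zero].group_bw += MAX_BW_ALLOC % zero_cnt;
}
```

With three zero-bandwidth ETS tcs, for example, this sketch yields shares of 33, 33 and 34 percent within tc_group #0.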
      
      Fixes: cdcf1121 ("net/mlx5e: Validate BW weight values of ETS")
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Properly deal with encap flows add/del under neigh update · 3c37745e
      Or Gerlitz committed
      Currently, the encap action offload is handled in the actions parse
      function and not in mlx5e_tc_add_fdb_flow() where we deal with all
      the other aspects of offloading actions (vlan, modify header) and
      the rule itself.
      
      When the neigh update code (mlx5e_tc_encap_flows_add()) recreates the
      encap entry and offloads the related flows, we wrongly call
      mlx5e_tc_add_fdb_flow() again. This by itself makes us repeat the
      offloading of vlans and header re-write, which leaves things in an
      inconsistent state and steps on freed memory (e.g. the modify
      header parse buffer, which is already freed).
      
      Since mlx5e_tc_add_fdb_flow() detaches and may release the encap entry
      on error, this corrupts the neigh update code, which goes over the
      list of flows associated with this encap entry, or causes a double
      free when the tc flow is later deleted by user-space.
      
      When neigh update (mlx5e_tc_encap_flows_del()) unoffloads the flows related
      to an encap entry which is now invalid, we do a partial repeat of the eswitch
      flow removal code which is wrong too.
      
      To fix things up we do the following:
      
      (1) handle the encap action offload in the eswitch flow add function
          mlx5e_tc_add_fdb_flow() as done for the other actions and the rule itself.
      
      (2) modify the neigh update code (mlx5e_tc_encap_flows_add/del) to only
          deal with the encap entry and rules delete/add and not with any of
          the other offloaded actions.
      
      Fixes: 232c0013 ('net/mlx5e: Add support to neighbour update flow')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Paul Blakey <paulb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Delay events till mlx5 interface's add complete for pci resume · 4ca637a2
      Huy Nguyen committed
      mlx5_ib_add is called during mlx5_pci_resume after a pci error.
      Before mlx5_ib_add completes, there are multiple events which trigger
      the function mlx5_ib_event. This causes a kernel panic because
      mlx5_ib_event accesses uninitialized resources.
      
      The fix is to extend Erez Shitrit's patch <97834eba>
      ("net/mlx5: Delay events till ib registration ends") to cover
      the pci resume code path.
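
The general idea of delaying events until the interface's add has completed can be illustrated with a minimal, single-threaded userspace sketch. Everything in it is hypothetical (`struct iface_ctx`, `iface_event()`, `iface_add_complete()`, `record_event()`, the fixed-size queue); it is not the mlx5 implementation, which additionally has to synchronize with the event path.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16 /* hypothetical queue depth */

typedef void (*event_handler)(int event, void *ctx);

struct iface_ctx {
    bool add_done;            /* set once the interface's add completed */
    event_handler handler;    /* valid only after add_done is true */
    void *handler_ctx;
    int pending[MAX_PENDING]; /* events that arrived before add completed */
    size_t n_pending;
};

/* Deliver an event: returns 0 if delivered, 1 if delayed, -1 if dropped. */
int iface_event(struct iface_ctx *c, int event)
{
    if (!c->add_done) {
        if (c->n_pending == MAX_PENDING)
            return -1;
        /* The handler's resources are not ready yet, so queue the event. */
        c->pending[c->n_pending++] = event;
        return 1;
    }
    c->handler(event, c->handler_ctx);
    return 0;
}

/* Mark the add as complete and replay the delayed events in arrival order. */
void iface_add_complete(struct iface_ctx *c, event_handler h, void *ctx)
{
    size_t i;

    c->handler = h;
    c->handler_ctx = ctx;
    c->add_done = true;
    for (i = 0; i < c->n_pending; i++)
        h(c->pending[i], ctx);
    c->n_pending = 0;
}

/* Example handler that records the events it receives. */
struct event_log {
    int seen[MAX_PENDING];
    size_t n;
};

void record_event(int event, void *ctx)
{
    struct event_log *elog = ctx;

    if (elog->n < MAX_PENDING)
        elog->seen[elog->n++] = event;
}
```

In this sketch the handler is never invoked before `iface_add_complete()` runs, so it cannot touch state that has not been set up yet.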
      
      Trace:
      mlx5_core 0001:01:00.6: mlx5_pci_resume was called
      mlx5_core 0001:01:00.6: firmware version: 16.20.1011
      mlx5_core 0001:01:00.6: mlx5_attach_interface:164:(pid 779):
      mlx5_ib_event:2996:(pid 34777): warning: event on port 1
      mlx5_ib_event:2996:(pid 34782): warning: event on port 1
      Unable to handle kernel paging request for data at address 0x0001c104
      Faulting instruction address: 0xd000000008f411fc
      Oops: Kernel access of bad area, sig: 11 [#1]
      ...
      ...
      Call Trace:
      [c000000fff77bb70] [d000000008f4119c] mlx5_ib_event+0x64/0x470 [mlx5_ib] (unreliable)
      [c000000fff77bc60] [d000000008e67130] mlx5_core_event+0xb8/0x210 [mlx5_core]
      [c000000fff77bd10] [d000000008e4bd00] mlx5_eq_int+0x528/0x860 [mlx5_core]
      
      Fixes: 97834eba ("net/mlx5: Delay events till ib registration ends")
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Fix health work queue spin lock to IRQ safe · 6377ed0b
      Moshe Shemesh committed
      spin_lock/unlock of health->wq_lock should be IRQ safe.
      It was changed to spin_lock_irqsave by commit 0179720d
      ("net/mlx5: Introduce trigger_health_work function"), which uses
      spin_lock from asynchronous event (IRQ) context.
      Thus, all spin_lock/unlock of health->wq_lock should have been moved
      to IRQ safe mode.
      However, one occurrence in new code using this lock missed that
      change, resulting in a possible deadlock:
        kernel: Possible unsafe locking scenario:
        kernel:       CPU0
        kernel:       ----
        kernel:  lock(&(&health->wq_lock)->rlock);
        kernel:  <Interrupt>
        kernel:    lock(&(&health->wq_lock)->rlock);
        kernel:  *** DEADLOCK ***
      
      Fixes: 2a0165a0 ("net/mlx5: Cancel delayed recovery work when unloading the driver")
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
  8. 24 Oct, 2017 4 commits