  1. 28 Sep 2021, 1 commit
  2. 12 Aug 2021, 4 commits
    • net/mlx5: Allocate individual capability · 48f02eef
      By Parav Pandit
      Currently mlx5_core_dev contains an array of capabilities. It holds 19
      valid device capabilities, 2 reserved entries and 12 holes. As a
      result, the 14 unused entries cost 14 * 8K = 112K bytes of memory that
      is never used, and the mlx5_core_dev structure size is roughly
      270 Kbytes. The allocation is further rounded up to the next power of
      2, 512 Kbytes.
      
      By skipping the non-existent entries:
      (a) 112 Kbytes are saved,
      (b) mlx5_core_dev reduces to 8 Kbytes with alignment,
      (c) 350 Kbytes are saved in alignment.
      
      In the future, individual capability allocation can also be used to
      skip allocating a capability entirely when it is disabled at the
      device level. This patch prepares mlx5_core_dev to hold capabilities
      through pointers instead of an inline array, as in the sketch below.
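
      A minimal C sketch of the direction this prepares; the helper and the
      exact field layout below are illustrative stand-ins, not the verbatim
      mlx5 definitions:

          /* Hypothetical sketch: per-type capability pointers instead of an
           * inline array, so entries the device does not expose stay NULL
           * rather than occupying ~8K each. */
          struct mlx5_hca_cap {
                  u32 cur[MLX5_UN_SZ_DW(hca_cap_union)];
                  u32 max[MLX5_UN_SZ_DW(hca_cap_union)];
          };

          struct mlx5_core_dev {
                  /* ... */
                  /* was: struct mlx5_hca_cap hca[MLX5_CAP_NUM]; */
                  struct mlx5_hca_cap *hca[MLX5_CAP_NUM];
          };

          static int mlx5_alloc_cap(struct mlx5_core_dev *dev,
                                    enum mlx5_cap_type type)
          {
                  if (dev->hca[type])             /* already allocated */
                          return 0;
                  dev->hca[type] = kzalloc(sizeof(*dev->hca[type]), GFP_KERNEL);
                  return dev->hca[type] ? 0 : -ENOMEM;
          }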
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Reorganize current and maximal capabilities to be per-type · 5958a6fa
      By Parav Pandit
      In the current code, the current and maximal capabilities are
      maintained in two separate arrays, both indexed per type. To allow
      such a basic structure to be created as a dynamically allocated
      array, move the curr and max fields into a unified structure so that
      a specific capability can be allocated as one unit, as sketched below.
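
      A minimal sketch of the unification, assuming illustrative names (the
      exact mlx5 macros may differ):

          /* before: two parallel per-type arrays
           *   u32 hca_cur[MLX5_CAP_NUM][MLX5_UN_SZ_DW(hca_cap_union)];
           *   u32 hca_max[MLX5_CAP_NUM][MLX5_UN_SZ_DW(hca_cap_union)];
           * after: current and maximal values travel together, so one
           * capability type can be allocated (or skipped) as a single unit.
           */
          struct mlx5_hca_cap {
                  u32 cur[MLX5_UN_SZ_DW(hca_cap_union)];
                  u32 max[MLX5_UN_SZ_DW(hca_cap_union)];
          };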
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Delete impossible dev->state checks · 8e792700
      By Leon Romanovsky
      The new mlx5_core device structure is allocated through devlink_alloc
      with kzalloc, which ensures that all fields, including ->state, start
      out as zero.

      That means the checks of that field in mlx5_init_one() are completely
      redundant, because that function is called only once, at the beginning
      of the mlx5_core_dev lifetime.
      
      PCI:
       .probe()
        -> probe_one()
         -> mlx5_init_one()
      
      The recovery flow can't run at that time or before it, because the
      relevant work is initialized later, in mlx5_init_once().

      This initialization flow ensures that dev->state can never be
      MLX5_DEVICE_STATE_UNINITIALIZED there, so remove the impossible
      checks.
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix typo in comments · 39c538d6
      By Cai Huoqing
      Fix typo:
      *vectores  ==> vectors
      *realeased  ==> released
      *erros  ==> errors
      *namepsace  ==> namespace
      *trafic  ==> traffic
      *proccessed  ==> processed
      *retore  ==> restore
      *Currenlty  ==> Currently
      *crated  ==> created
      *chane  ==> change
      *cannnot  ==> cannot
      *usuallly  ==> usually
      *failes  ==> fails
      *importent  ==> important
      *reenabled  ==> re-enabled
      *alocation  ==> allocation
      *recived  ==> received
      *tanslation  ==> translation
      Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  3. 10 Aug 2021, 1 commit
  4. 06 Aug 2021, 1 commit
  5. 17 Jun 2021, 1 commit
    • net/mlx5e: Don't create devices during unload flow · a5ae8fc9
      By Dmytro Linkin
      Running the devlink reload command for a port in switchdev mode causes
      resource corruption: the driver can't release the allocated EQ and
      reclaim memory pages, because the "rdma" auxiliary device has added
      CQs which block the EQ from deletion.
      The erroneous sequence happens during the reload-down phase and is as
      follows:
      
      1. detach device - suspends auxiliary devices which support it,
         destroys the others. During this step "eth-rep" and "rdma-rep" are
         destroyed and "eth" is suspended.
      2. disable SRIOV - moves the device to legacy mode; as part of the
         disablement, rescans drivers. This step adds the "rdma" auxiliary
         device.
      3. destroy EQ table - <failure>.
      
      The driver shouldn't create any device during unload flows. To handle
      that, implement the MLX5_PRIV_FLAGS_DETACH flag: set it on device
      detach and clear it on device attach. If the flag is set, the drivers
      rescan is a no-op, as in the sketch below.
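
      A hypothetical sketch of the flag-guarded rescan, following the
      description above (function and field names are approximations, not
      the verbatim driver code):

          void mlx5_detach_device(struct mlx5_core_dev *dev)
          {
                  dev->priv.flags |= MLX5_PRIV_FLAGS_DETACH;
                  /* ... suspend or destroy auxiliary devices ... */
          }

          int mlx5_rescan_drivers_locked(struct mlx5_core_dev *dev)
          {
                  /* unload flow in progress: creating devices here would
                   * block EQ teardown, so do nothing */
                  if (dev->priv.flags & MLX5_PRIV_FLAGS_DETACH)
                          return 0;
                  /* ... normal rescan: (re)create auxiliary devices ... */
                  return 0;
          }

          void mlx5_attach_device(struct mlx5_core_dev *dev)
          {
                  dev->priv.flags &= ~MLX5_PRIV_FLAGS_DETACH;
                  /* ... attach auxiliary devices ... */
          }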
      
      Fixes: a925b5e3 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus")
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  6. 28 May 2021, 1 commit
  7. 19 May 2021, 1 commit
  8. 17 Apr 2021, 1 commit
  9. 03 Apr 2021, 2 commits
    • net/mlx5: Allocate rate limit table when rate is configured · 6b30b6d4
      By Parav Pandit
      A device supports 128 rate limiters. A static table allocation
      consumes 8 Kbytes of memory even when no rate is configured.

      Instead, allocate the table when at least one rate is configured, as
      in the sketch below.
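
      A minimal sketch of the lazy allocation, assuming illustrative names
      for the table fields (not the exact mlx5 code):

          /* Allocate the rate-limit entry table on first use; callers are
           * expected to hold the table lock while configuring a rate. */
          static int mlx5_rl_table_get(struct mlx5_rl_table *table)
          {
                  if (table->rl_entry)            /* already allocated */
                          return 0;
                  table->rl_entry = kcalloc(table->max_size,
                                            sizeof(*table->rl_entry),
                                            GFP_KERNEL);
                  return table->rl_entry ? 0 : -ENOMEM;
          }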
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Pack mlx5_rl_entry structure · 4c4c0a89
      By Parav Pandit
      The mlx5_rl_entry structure is not properly packed, as shown below.
      Due to this, an array of 9144 bytes is allocated, which is then
      aligned up to 16 Kbytes. Hence, pack the structure and avoid the
      waste.

      This saves 8 Kbytes per mlx5_core_dev struct.
      
      pahole -C mlx5_rl_entry  drivers/net/ethernet/mellanox/mlx5/core/en_main.o
      
      Existing layout:
      
      struct mlx5_rl_entry {
              u8                         rl_raw[48];           /*     0    48 */
              u16                        index;                /*    48     2 */
      
              /* XXX 6 bytes hole, try to pack */
      
              u64                        refcount;             /*    56     8 */
              /* --- cacheline 1 boundary (64 bytes) --- */
              u16                        uid;                  /*    64     2 */
              u8                         dedicated:1;          /*    66: 0  1 */
      
              /* size: 72, cachelines: 2, members: 5 */
              /* sum members: 60, holes: 1, sum holes: 6 */
              /* sum bitfield members: 1 bits (0 bytes) */
              /* padding: 5 */
              /* bit_padding: 7 bits */
              /* last cacheline: 8 bytes */
      };
      
      After alignment:
      
      struct mlx5_rl_entry {
              u8                         rl_raw[48];           /*     0    48 */
              u64                        refcount;             /*    48     8 */
              u16                        index;                /*    56     2 */
              u16                        uid;                  /*    58     2 */
              u8                         dedicated:1;          /*    60: 0  1 */
      
              /* size: 64, cachelines: 1, members: 5 */
              /* padding: 3 */
              /* bit_padding: 7 bits */
      };
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  10. 17 Mar 2021, 3 commits
  11. 13 Mar 2021, 2 commits
  12. 12 Mar 2021, 1 commit
  13. 17 Feb 2021, 2 commits
  14. 11 Feb 2021, 1 commit
  15. 09 Feb 2021, 1 commit
  16. 06 Feb 2021, 1 commit
  17. 28 Jan 2021, 2 commits
  18. 23 Jan 2021, 4 commits
    • net/mlx5: SF, Add port add delete functionality · 8f010541
      By Parav Pandit
      To handle SF port management outside of the eswitch as an independent
      software layer, introduce eswitch notifier APIs so that mlx5 upper
      layers that wish to support SF port management in switchdev mode can
      perform their task whenever the eswitch mode is set to switchdev, or
      before the eswitch is disabled.

      Initialize the SF port table on such an eswitch event.

      Add SF port add and delete functionality in switchdev mode.
      Destroy all SF ports when the eswitch is disabled.
      Expose SF port add and delete to the user via devlink commands, as in
      the sketch and examples below.
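
      A hypothetical sketch of the notifier wiring, built on the kernel's
      standard notifier infrastructure (the event codes and handler name are
      illustrative, not the exact mlx5 API):

          enum mlx5_esw_event {               /* illustrative event codes */
                  ESW_OFFLOADS_ENABLED,
                  ESW_OFFLOADS_DISABLED,
          };

          /* SF port table reacts to eswitch mode transitions. */
          static int mlx5_sf_esw_event(struct notifier_block *nb,
                                       unsigned long event, void *data)
          {
                  switch (event) {
                  case ESW_OFFLOADS_ENABLED:
                          /* eswitch entered switchdev: init SF port table */
                          break;
                  case ESW_OFFLOADS_DISABLED:
                          /* eswitch going down: destroy all SF ports */
                          break;
                  }
                  return NOTIFY_OK;
          }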
      
      $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
      
      $ devlink port show
      pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false
      
      $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
      pci/0000:06:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
        function:
          hw_addr 00:00:00:00:00:00 state inactive opstate detached
      
      $ devlink port show ens2f0npf0sf88
      pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
        function:
          hw_addr 00:00:00:00:00:00 state inactive opstate detached
      
      or by its unique port index:
      $ devlink port show pci/0000:06:00.0/32768
      pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
        function:
          hw_addr 00:00:00:00:00:00 state inactive opstate detached
      
      $ devlink port show ens2f0npf0sf88 -jp
      {
          "port": {
              "pci/0000:06:00.0/32768": {
                  "type": "eth",
                  "netdev": "ens2f0npf0sf88",
                  "flavour": "pcisf",
                  "controller": 0,
                  "pfnum": 0,
                  "sfnum": 88,
                  "external": false,
                  "splittable": false,
                  "function": {
                      "hw_addr": "00:00:00:00:00:00",
                      "state": "inactive",
                      "opstate": "detached"
                  }
              }
          }
      }
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Vu Pham <vuhuong@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: SF, Add auxiliary device driver · 1958fc2f
      By Parav Pandit
      Add an auxiliary device driver for the mlx5 subfunction auxiliary
      device.

      An mlx5 subfunction is similar to a PCI PF or VF. For each
      subfunction, an auxiliary device is created.

      As a result, when an mlx5 SF auxiliary device binds to the driver
      (sketched further below), its netdev and rdma device are created;
      they appear as:
      
      $ ls -l /sys/bus/auxiliary/devices/
      mlx5_core.sf.4 -> ../../../devices/pci0000:00/0000:00:03.0/0000:06:00.0/mlx5_core.sf.4
      
      $ ls -l /sys/class/net/eth1/device
      /sys/class/net/eth1/device -> ../../../mlx5_core.sf.4
      
      $ cat /sys/bus/auxiliary/devices/mlx5_core.sf.4/sfnum
      88
      
      $ devlink dev show
      pci/0000:06:00.0
      auxiliary/mlx5_core.sf.4
      
      $ devlink port show auxiliary/mlx5_core.sf.4/1
      auxiliary/mlx5_core.sf.4/1: type eth netdev p0sf88 flavour virtual port 0 splittable false
      
      $ rdma link show mlx5_0/1
      link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev p0sf88
      
      $ rdma dev show
      8: rocep6s0f1: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d113 sys_image_guid 248a:0703:00b3:d112
      13: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112
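
      A minimal sketch of such an auxiliary driver registration, using the
      standard Linux auxiliary bus API (the probe/remove bodies and the
      id-table contents are illustrative):

          #include <linux/auxiliary_bus.h>

          static const struct auxiliary_device_id mlx5_sf_id_table[] = {
                  { .name = "mlx5_core.sf" },  /* matches mlx5_core.sf.<N> */
                  {},
          };
          MODULE_DEVICE_TABLE(auxiliary, mlx5_sf_id_table);

          static int mlx5_sf_probe(struct auxiliary_device *adev,
                                   const struct auxiliary_device_id *id)
          {
                  /* map the SF's BAR window, init and load the core dev;
                   * netdev and rdma devices get created as a result */
                  return 0;
          }

          static void mlx5_sf_remove(struct auxiliary_device *adev)
          {
                  /* unload the core dev and release the BAR window */
          }

          static struct auxiliary_driver mlx5_sf_driver = {
                  .name = "sf",
                  .probe = mlx5_sf_probe,
                  .remove = mlx5_sf_remove,
                  .id_table = mlx5_sf_id_table,
          };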
      
      In the future, the devlink device instance name will be adapted to
      carry the sfnum annotation, either via an alias or in the devlink
      instance name, as described in RFC [1].

      [1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Vu Pham <vuhuong@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: SF, Add auxiliary device support · 90d010b8
      By Parav Pandit
      Introduce an API to add and delete an auxiliary device for an SF.
      Each SF has its own dedicated window in PCI BAR 2.

      An SF device is similar to a PCI PF or VF in that it supports multiple
      classes of devices such as net, rdma and vdpa.

      The SF device will be added or removed in a subsequent patch, during
      the SF devlink port function state change command.

      A subfunction device exposes the user-supplied subfunction number,
      which systemd/udev then uses to derive deterministic names for its
      netdevice and rdma device.
      
      An mlx5 subfunction auxiliary device example:
      
      $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
      
      $ devlink port show
      pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false
      
      $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
      pci/0000:08:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
        function:
          hw_addr 00:00:00:00:00:00 state inactive opstate detached
      
      $ devlink port show ens2f0npf0sf88
      pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
        function:
          hw_addr 00:00:00:00:88:88 state inactive opstate detached
      
      $ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88 state active
      
      On activation,
      
      $ ls -l /sys/bus/auxiliary/devices/
      mlx5_core.sf.4 -> ../../../devices/pci0000:00/0000:00:03.0/0000:06:00.0/mlx5_core.sf.4
      
      $ cat /sys/bus/auxiliary/devices/mlx5_core.sf.4/sfnum
      88
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Vu Pham <vuhuong@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Introduce vhca state event notifier · f3196bb0
      By Parav Pandit
      vhca state events indicate a change in the state of the vhca that may
      occur due to SF allocation, deallocation, or enabling/disabling of
      the SF HCA.

      Introduce a vhca state event handler, which the SF devlink port
      manager and SF hardware id allocator will use in subsequent patches
      to act on the event.

      This enables a single entity to subscribe to, query and re-arm the
      event for a function, as in the sketch below.
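
      A hypothetical sketch of the handler shape, using the kernel's generic
      notifier_block (the wrapper struct and work-queueing details are
      illustrative, not the exact mlx5 code):

          struct mlx5_vhca_events {           /* illustrative wrapper */
                  struct notifier_block nb;
                  struct workqueue_struct *wq;
                  struct work_struct work;
          };

          /* Invoked from the device event chain on a VHCA state change;
           * defers the heavy lifting (query state, re-arm the event for
           * the function) to work context. */
          static int mlx5_vhca_state_change(struct notifier_block *nb,
                                            unsigned long type, void *data)
          {
                  struct mlx5_vhca_events *events =
                          container_of(nb, struct mlx5_vhca_events, nb);

                  queue_work(events->wq, &events->work);
                  return NOTIFY_OK;
          }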
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Vu Pham <vuhuong@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  19. 20 Jan 2021, 1 commit
  20. 06 Dec 2020, 1 commit
  21. 04 Dec 2020, 2 commits
    • net/mlx5: Register mlx5 devices to auxiliary virtual bus · a925b5e3
      By Leon Romanovsky
      Create auxiliary devices under the new virtual bus. This will replace
      the custom-made mlx5 ->add()/->remove() interfaces; the next patches
      will fill in the missing callbacks and remove the old interface logic.

      Auxiliary drivers attach to these devices strictly 1-to-1, which
      requires creating a device for every protocol so that each protocol
      driver (module) can connect to it, as in the sketch below.
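
      A minimal sketch of creating one such per-protocol device with the
      standard auxiliary bus API (the helper and release function are
      illustrative, not the exact mlx5 code):

          #include <linux/auxiliary_bus.h>

          static void adev_release(struct device *dev)
          {
                  kfree(container_of(dev, struct auxiliary_device, dev));
          }

          /* Creates mlx5_core.<proto>.<idx>, e.g. mlx5_core.rdma.0 */
          static int add_adev(struct device *parent, const char *proto, u32 idx)
          {
                  struct auxiliary_device *adev;
                  int ret;

                  adev = kzalloc(sizeof(*adev), GFP_KERNEL);
                  if (!adev)
                          return -ENOMEM;
                  adev->name = proto;           /* "eth", "rdma" or "vdpa" */
                  adev->id = idx;
                  adev->dev.parent = parent;
                  adev->dev.release = adev_release;

                  ret = auxiliary_device_init(adev);
                  if (ret) {
                          kfree(adev);
                          return ret;
                  }
                  /* prefixes KBUILD_MODNAME, yielding "mlx5_core.<proto>" */
                  ret = auxiliary_device_add(adev);
                  if (ret)
                          auxiliary_device_uninit(adev);
                  return ret;
          }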
      
      System with 2 IB and 1 RoCE cards:
      [leonro@vm ~]$ lspci |grep nox
      00:09.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
      00:0a.0 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]
      00:0b.0 Ethernet controller: Mellanox Technologies MT2910 Family [ConnectX-7]
      [leonro@vm ~]$ ls -l /sys/bus/auxiliary/devices/
       mlx5_core.eth.2 -> ../../../devices/pci0000:00/0000:00:0b.0/mlx5_core.eth.2
       mlx5_core.rdma.0 -> ../../../devices/pci0000:00/0000:00:09.0/mlx5_core.rdma.0
       mlx5_core.rdma.1 -> ../../../devices/pci0000:00/0000:00:0a.0/mlx5_core.rdma.1
       mlx5_core.rdma.2 -> ../../../devices/pci0000:00/0000:00:0b.0/mlx5_core.rdma.2
       mlx5_core.vdpa.1 -> ../../../devices/pci0000:00/0000:00:0a.0/mlx5_core.vdpa.1
       mlx5_core.vdpa.2 -> ../../../devices/pci0000:00/0000:00:0b.0/mlx5_core.vdpa.2
      [leonro@vm ~]$ rdma dev
      0: ibp0s9: node_type ca fw 4.6.9999 node_guid 5254:00c0:fe12:3455 sys_image_guid 5254:00c0:fe12:3455
      1: ibp0s10: node_type ca fw 4.6.9999 node_guid 5254:00c0:fe12:3456 sys_image_guid 5254:00c0:fe12:3456
      2: rdmap0s11: node_type ca fw 4.6.9999 node_guid 5254:00c0:fe12:3457 sys_image_guid 5254:00c0:fe12:3457
      
      System with RoCE SR-IOV card with 4 VFs:
      [leonro@vm ~]$ lspci |grep nox
      01:00.0 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]
      01:00.1 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6 Virtual Function]
      01:00.2 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6 Virtual Function]
      01:00.3 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6 Virtual Function]
      01:00.4 Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6 Virtual Function]
      [leonro@vm ~]$ ls -l /sys/bus/auxiliary/devices/
       mlx5_core.eth.0 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.0/mlx5_core.eth.0
       mlx5_core.eth.1 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.1/mlx5_core.eth.1
       mlx5_core.eth.2 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.2/mlx5_core.eth.2
       mlx5_core.eth.3 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.3/mlx5_core.eth.3
       mlx5_core.eth.4 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.4/mlx5_core.eth.4
       mlx5_core.rdma.0 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.0/mlx5_core.rdma.0
       mlx5_core.rdma.1 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.1/mlx5_core.rdma.1
       mlx5_core.rdma.2 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.2/mlx5_core.rdma.2
       mlx5_core.rdma.3 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.3/mlx5_core.rdma.3
       mlx5_core.rdma.4 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.4/mlx5_core.rdma.4
       mlx5_core.vdpa.1 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.1/mlx5_core.vdpa.1
       mlx5_core.vdpa.2 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.2/mlx5_core.vdpa.2
       mlx5_core.vdpa.3 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.3/mlx5_core.vdpa.3
       mlx5_core.vdpa.4 -> ../../../devices/pci0000:00/0000:00:09.0/0000:01:00.4/mlx5_core.vdpa.4
      [leonro@vm ~]$ rdma dev
      0: rocep1s0f0: node_type ca fw 4.6.9999 node_guid 5254:00c0:fe12:3455 sys_image_guid 5254:00c0:fe12:3455
      1: rocep1s0f0v0: node_type ca fw 4.6.9999 node_guid 0000:0000:0000:0000 sys_image_guid 5254:00c0:fe12:3456
      2: rocep1s0f0v1: node_type ca fw 4.6.9999 node_guid 0000:0000:0000:0000 sys_image_guid 5254:00c0:fe12:3457
      3: rocep1s0f0v2: node_type ca fw 4.6.9999 node_guid 0000:0000:0000:0000 sys_image_guid 5254:00c0:fe12:3458
      4: rocep1s0f0v3: node_type ca fw 4.6.9999 node_guid 0000:0000:0000:0000 sys_image_guid 5254:00c0:fe12:3459
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    • net/mlx5_core: Clean driver version and name · 17a7612b
      By Leon Romanovsky
      Remove the exposed driver version, as was done in other drivers, so
      that the module version works correctly by displaying the kernel
      version it was compiled for.

      Also move the mlx5_core module name to a general include, so auxiliary
      drivers can use it as the basis for the names in their device ID
      tables, as sketched below.
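
      A short sketch of the intent; the macro name and header location are
      assumptions for illustration:

          /* in a shared mlx5 header, e.g. include/linux/mlx5/driver.h */
          #define MLX5_ADEV_NAME "mlx5_core"

          /* an auxiliary driver's ID table then builds on that name */
          static const struct auxiliary_device_id mlx5e_id_table[] = {
                  { .name = MLX5_ADEV_NAME ".eth" },
                  {},
          };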
      Reviewed-by: Parav Pandit <parav@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
  22. 27 Nov 2020, 3 commits
  23. 27 Oct 2020, 1 commit
    • RDMA/mlx5: Fix devlink deadlock on net namespace deletion · fbdd0049
      By Parav Pandit
      When an mlx5 core devlink instance is reloaded into a different net
      namespace, its associated IB device is deleted and recreated.
      
      Example sequence is:
      $ ip netns add foo
      $ devlink dev reload pci/0000:00:08.0 netns foo
      $ ip netns del foo
      
      The mlx5 IB device needs to attach and detach its netdevice through
      the netdev notifier chain during the load and unload sequences. Below
      is a call graph of the unload flow.
      
      cleanup_net()
         down_read(&pernet_ops_rwsem); <- first sem acquired
           ops_pre_exit_list()
             pre_exit()
               devlink_pernet_pre_exit()
                 devlink_reload()
                   mlx5_devlink_reload_down()
                     mlx5_unload_one()
                     [...]
                       mlx5_ib_remove()
                         mlx5_ib_unbind_slave_port()
                           mlx5_remove_netdev_notifier()
                             unregister_netdevice_notifier()
                                down_write(&pernet_ops_rwsem); <- recursive lock
      
      Hence, when the net namespace is deleted, the mlx5 reload results in
      a deadlock.

      When the deadlock occurs, the devlink mutex is also held. This not
      only deadlocks the mlx5 device under reload; every process that
      attempts to access an unrelated devlink device deadlocks as well.

      Hence, fix this by having the mlx5 ib driver register a per-net netdev
      notifier instead of the global one; it operates on the net namespace
      without holding pernet_ops_rwsem. A sketch of the direction follows.
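
      A hypothetical sketch of the per-net registration (the helper that
      resolves the device's namespace and the handler name are assumptions):

          static int mlx5_ib_register_netdev_notifier(struct mlx5_ib_dev *ibdev,
                                                      struct notifier_block *nb)
          {
                  nb->notifier_call = mlx5_ib_netdev_event;  /* assumed handler */
                  /* per-net variant: unlike register_netdevice_notifier(),
                   * unregistering does not take pernet_ops_rwsem */
                  return register_netdevice_notifier_net(
                                  mlx5_core_net(ibdev->mdev), nb);
          }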
      
      Fixes: 4383cfcc ("net/mlx5: Add devlink reload")
      Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  24. 10 Oct 2020, 1 commit
  25. 03 Oct 2020, 1 commit
    • net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible · b898ce7b
      By Saeed Mahameed
      In case the PCI device is offline, reclaim_pages_cmd() will still try
      to call the FW to release FW pages; cmd_exec() in this case returns a
      silent success without actually calling the FW.

      This is wrong and will cause page leaks. What we should do is detect
      a PCI-offline or command-interface-unavailable state before trying to
      access the FW, and manually release the FW pages in the driver.

      In this patch we share the code that checks for FW command interface
      availability and call it in sensitive places, e.g. reclaim_pages_cmd()
      (sketched after the list below).
      
      An alternative fix would have been to:
       1. Remove MLX5_CMD_OP_MANAGE_PAGES from mlx5_internal_err_ret_value,
          the command-success simulation list.
       2. Always release FW pages even if cmd_exec fails in
          reclaim_pages_cmd().
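
      A hypothetical sketch of the shared check and its use in the reclaim
      path (the helper names and the checked state are illustrative):

          /* FW is unreachable if the PCI channel is offline or the device
           * is in internal error. */
          static bool mlx5_fw_unreachable(struct mlx5_core_dev *dev)
          {
                  return pci_channel_offline(dev->pdev) ||
                         dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR;
          }

          static int reclaim_pages_cmd(struct mlx5_core_dev *dev,
                                       u32 *in, int in_size,
                                       u32 *out, int out_size)
          {
                  if (!mlx5_fw_unreachable(dev))
                          return mlx5_cmd_exec(dev, in, in_size, out, out_size);
                  /* FW unreachable: release the FW pages in the driver
                   * instead of simulating command success */
                  return reclaim_pages_manually(dev, in, out);  /* hypothetical */
          }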
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>