1. 19 Jan, 2019 3 commits
    • E
      devlink: Add health report functionality · c7af343b
      Eran Ben Elisha committed
      Upon discovering an error, any driver can report it to the devlink health
      mechanism via the devlink_health_report function, using the appropriate
      reporter registered to it. The driver can pass error-specific context, which
      will be delivered back to the driver as part of the dump / recovery callbacks.
      
      Once an error is reported, devlink health will perform the following actions:
      * A log is sent to the kernel trace events buffer
      * Health status and statistics are updated for the reporter instance
      * An object dump is taken and stored at the reporter instance (as long
        as no other dump is already stored)
      * An auto recovery attempt is made, depending on:
        - Auto Recovery configuration
        - Grace period vs. time since last recovery
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
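The report flow described in this commit can be sketched as a small userspace model. This is not the kernel implementation: every `hr_` name is hypothetical, timestamps are plain milliseconds supplied by the caller, and the sketch assumes that an automatic recovery is skipped while the grace period since the last recovery has not yet elapsed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace model of a reporter; not the kernel's structure. */
struct hr_reporter {
    const char *name;
    bool auto_recover;          /* auto recovery configuration */
    uint64_t grace_period_ms;   /* minimum gap between automatic recoveries */
    uint64_t last_recovery_ms;  /* time of the last successful recovery */
    bool has_recovered;         /* false until the first recovery */
    bool dump_stored;           /* true once a dump is held */
    bool healthy;
    unsigned error_count;
    unsigned recovery_count;
    int (*recover)(void *ctx);  /* driver callbacks, return 0 on success */
    int (*dump)(void *ctx);
};

/* Example driver context and callbacks (hypothetical). */
struct hr_ctx { int dumps; int recoveries; };

int hr_example_dump(void *ctx)    { ((struct hr_ctx *)ctx)->dumps++; return 0; }
int hr_example_recover(void *ctx) { ((struct hr_ctx *)ctx)->recoveries++; return 0; }

/* Returns 0 if the reporter was recovered, -1 if recovery was skipped or failed. */
int hr_health_report(struct hr_reporter *r, const char *msg, void *ctx,
                     uint64_t now_ms)
{
    /* 1. Log the event (the commit uses the kernel trace events buffer). */
    fprintf(stderr, "health report: %s: %s\n", r->name, msg);

    /* 2. Update health status and statistics. */
    r->healthy = false;
    r->error_count++;

    /* 3. Take a dump only if no other dump is already stored. */
    if (!r->dump_stored && r->dump && r->dump(ctx) == 0)
        r->dump_stored = true;

    /* 4. Auto recovery depends on configuration and on the grace period. */
    if (!r->auto_recover || !r->recover)
        return -1;
    if (r->has_recovered &&
        now_ms - r->last_recovery_ms < r->grace_period_ms)
        return -1; /* assumption: still inside the grace period, skip */
    if (r->recover(ctx) != 0)
        return -1;

    r->healthy = true;
    r->has_recovered = true;
    r->last_recovery_ms = now_ms;
    r->recovery_count++;
    return 0;
}
```

With a 500 ms grace period, a report at t=1000 recovers, a second report at t=1200 is only counted and logged, and a third at t=1600 recovers again. The dump callback runs once, because the first dump stays stored.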
    • E
      devlink: Add health reporter create/destroy functionality · 880ee82f
      Eran Ben Elisha committed
      A devlink health reporter is an instance for reporting, diagnosing and
      recovering from run-time errors discovered by the reporters.
      Define its data structure and supported operations.
      In addition, expose a devlink API to create and destroy a reporter.
      Each devlink instance will hold its own list of reporters.
      
      As part of the allocation, the driver shall provide a set of callbacks which
      will be used by devlink to handle health reports and user
      commands related to this reporter. In addition, the driver may
      provide a priv pointer, which can be fetched from the reporter by the
      devlink_health_reporter_priv function.
      
      For each reporter, devlink will hold metadata consisting of statistics,
      buffers and status.
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
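The create/destroy lifecycle described in this commit might look like the following userspace sketch. All `rep_` names are hypothetical and do not reproduce the kernel signatures. Rejecting a second reporter with the same name is an assumption of this sketch, not something the commit message states.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback set supplied by the driver at creation time. */
struct rep_ops {
    const char *name;
    int (*recover)(void *priv);
    int (*dump)(void *priv);
    int (*diagnose)(void *priv);
};

struct rep_reporter {
    const struct rep_ops *ops;
    void *priv;                 /* opaque driver pointer */
    struct rep_reporter *next;  /* link in the per-devlink list */
};

/* Each devlink instance holds its own list of reporters. */
struct rep_devlink {
    struct rep_reporter *reporters;
};

struct rep_reporter *rep_create(struct rep_devlink *dl,
                                const struct rep_ops *ops, void *priv)
{
    struct rep_reporter *r;

    if (!ops || !ops->name)
        return NULL;
    /* Assumption: reporter names are unique within one devlink instance. */
    for (r = dl->reporters; r; r = r->next)
        if (strcmp(r->ops->name, ops->name) == 0)
            return NULL;

    r = calloc(1, sizeof(*r));
    if (!r)
        return NULL;
    r->ops = ops;
    r->priv = priv;
    r->next = dl->reporters;
    dl->reporters = r;
    return r;
}

void rep_destroy(struct rep_devlink *dl, struct rep_reporter *r)
{
    struct rep_reporter **pp;

    /* Unlink the reporter from the list, then free it. */
    for (pp = &dl->reporters; *pp; pp = &(*pp)->next) {
        if (*pp == r) {
            *pp = r->next;
            free(r);
            return;
        }
    }
}

/* Fetch the driver's priv pointer back from the reporter. */
void *rep_priv(const struct rep_reporter *r)
{
    return r->priv;
}
```

Keeping the callbacks in a `const` ops structure lets several reporter instances share one table, while the priv pointer carries the per-instance driver state.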
    • E
      devlink: Add health buffer support · cb5ccfbe
      Eran Ben Elisha committed
      The devlink health buffer is a mechanism to pass descriptors between drivers
      and devlink. The API allows the driver to add objects, object pairs,
      value arrays (nested attributes), values and names.
      
      The driver can use this API to fill the buffers in a format which
      devlink can translate into a netlink message.
      
      To achieve this, an internal buffer descriptor is defined. It
      will hold the data and metadata for each attribute and be used to pass
      the actual commands to netlink.
      
      This mechanism will later be used in devlink health for the dump and diagnose
      data stored by the drivers.
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
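As a rough illustration of the descriptor idea in this commit, the userspace sketch below records each object, pair, array, name and value as one tagged descriptor and tracks nesting depth. The `hb_` names, the fixed-size array and the `uint32_t`-only value are all assumptions made for brevity. The kernel's actual descriptor layout and its translation to netlink are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor kinds: three nest openers, a nest closer, name, value. */
enum hb_kind { HB_OBJECT, HB_PAIR, HB_ARRAY, HB_NEST_END, HB_NAME, HB_VALUE };

/* One descriptor: the metadata (kind) plus the data it carries. */
struct hb_desc {
    enum hb_kind kind;
    const char *name;   /* used by HB_NAME */
    uint32_t value;     /* used by HB_VALUE */
};

#define HB_MAX_DESC 64

struct hb_buffer {
    struct hb_desc desc[HB_MAX_DESC];
    size_t count;       /* descriptors recorded so far */
    int depth;          /* currently open nests */
};

void hb_init(struct hb_buffer *b)
{
    b->count = 0;
    b->depth = 0;
}

/* Append one descriptor; fails when the buffer is full. */
static int hb_put(struct hb_buffer *b, enum hb_kind kind,
                  const char *name, uint32_t value)
{
    if (b->count >= HB_MAX_DESC)
        return -1;
    b->desc[b->count].kind = kind;
    b->desc[b->count].name = name;
    b->desc[b->count].value = value;
    b->count++;
    return 0;
}

/* Open an object, pair or array nest. */
int hb_nest_start(struct hb_buffer *b, enum hb_kind kind)
{
    if (kind != HB_OBJECT && kind != HB_PAIR && kind != HB_ARRAY)
        return -1;
    if (hb_put(b, kind, NULL, 0))
        return -1;
    b->depth++;
    return 0;
}

/* Close the innermost nest; fails if nothing is open. */
int hb_nest_end(struct hb_buffer *b)
{
    if (b->depth == 0 || hb_put(b, HB_NEST_END, NULL, 0))
        return -1;
    b->depth--;
    return 0;
}

int hb_name(struct hb_buffer *b, const char *name)
{
    return hb_put(b, HB_NAME, name, 0);
}

int hb_value_u32(struct hb_buffer *b, uint32_t value)
{
    return hb_put(b, HB_VALUE, NULL, value);
}

/* A buffer is ready for translation once every nest has been closed. */
int hb_is_complete(const struct hb_buffer *b)
{
    return b->depth == 0;
}
```

A driver-side caller would open an object, open a pair, add a name and a value, and close both nests. A later pass could then walk `desc[]` in order and emit the matching nested attributes.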
  2. 04 Dec, 2018 1 commit
  3. 11 Oct, 2018 3 commits
  4. 09 Oct, 2018 1 commit
  5. 05 Oct, 2018 3 commits
  6. 04 Oct, 2018 1 commit
  7. 22 Sep, 2018 1 commit
  8. 13 Jul, 2018 8 commits
  9. 05 Jul, 2018 7 commits
  10. 06 Jun, 2018 1 commit
  11. 29 May, 2018 1 commit
  12. 24 May, 2018 1 commit
  13. 20 May, 2018 3 commits
  14. 09 Apr, 2018 1 commit
    • J
      devlink: convert occ_get op to separate registration · fc56be47
      Jiri Pirko committed
      This resolves a race during initialization where the resources with
      ops are registered before the driver and the structures used by the occ_get
      op are initialized. So keep occ_get callbacks registered only while
      all structs are initialized.
      
      The example flows, as they are in mlxsw:
      1) driver load/asic probe:
         mlxsw_core
            -> mlxsw_sp_resources_register
              -> mlxsw_sp_kvdl_resources_register
                -> devlink_resource_register IDX
         mlxsw_spectrum
            -> mlxsw_sp_kvdl_init
              -> mlxsw_sp_kvdl_parts_init
                -> mlxsw_sp_kvdl_part_init
                  -> devlink_resource_size_get IDX (to get the current setup
                                                    size from devlink)
              -> devlink_resource_occ_get_register IDX (register current
                                                        occupancy getter)
      2) reload triggered by devlink command:
        -> mlxsw_devlink_core_bus_device_reload
          -> mlxsw_sp_fini
            -> mlxsw_sp_kvdl_fini
              -> devlink_resource_occ_get_unregister IDX
          (struct mlxsw_sp *mlxsw_sp is freed at this point; a call to occ_get,
           which uses mlxsw_sp, would cause a use-after-free)
          -> mlxsw_sp_init
            -> mlxsw_sp_kvdl_init
              -> mlxsw_sp_kvdl_parts_init
                -> mlxsw_sp_kvdl_part_init
                  -> devlink_resource_size_get IDX (to get the current setup
                                                    size from devlink)
              -> devlink_resource_occ_get_register IDX (register current
                                                        occupancy getter)
      
      Fixes: d9f9b9a4 ("devlink: Add support for resource abstraction")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
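The idea of registering the occupancy getter separately from the resource can be sketched in userspace as follows. The `res_` names are hypothetical and the signatures are simplified; they are not the kernel's `devlink_resource_occ_get_register` / `devlink_resource_occ_get_unregister` prototypes. The point shown is that the query path never calls into the driver unless a getter is currently registered.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical occupancy getter type: reads occupancy from driver state. */
typedef uint64_t res_occ_get_t(void *priv);

struct res_resource {
    uint64_t id;
    uint64_t size;
    res_occ_get_t *occ_get;  /* NULL while the driver is not initialized */
    void *occ_get_priv;
};

/* Called by the driver once its own structures are fully initialized. */
int res_occ_get_register(struct res_resource *res, res_occ_get_t *occ_get,
                         void *priv)
{
    if (!occ_get || res->occ_get)
        return -1;
    res->occ_get = occ_get;
    res->occ_get_priv = priv;
    return 0;
}

/* Called by the driver before it frees the state the getter depends on. */
void res_occ_get_unregister(struct res_resource *res)
{
    res->occ_get = NULL;
    res->occ_get_priv = NULL;
}

/* Query path: returns -1 instead of touching freed or unset driver state. */
int res_occ_query(const struct res_resource *res, uint64_t *occ)
{
    if (!res->occ_get)
        return -1;
    *occ = res->occ_get(res->occ_get_priv);
    return 0;
}

/* Example driver state and getter (hypothetical). */
struct res_driver { uint64_t entries_in_use; };

uint64_t res_driver_occ_get(void *priv)
{
    return ((struct res_driver *)priv)->entries_in_use;
}
```

During a reload, the driver would call `res_occ_get_unregister` in its fini path and `res_occ_get_register` again at the end of init, so a query arriving in between fails cleanly rather than dereferencing freed memory. A real implementation would also need locking between the query and unregister paths, which this sketch omits.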
  15. 23 Mar, 2018 1 commit
  16. 20 Mar, 2018 1 commit
  17. 09 Mar, 2018 1 commit
  18. 01 Mar, 2018 1 commit
  19. 28 Feb, 2018 1 commit