1. 22 April 2021, 4 commits
  2. 21 April 2021, 1 commit
    • RDMA/mlx5: Expose private query port · 9a89d3ad
      Authored by Mark Bloch
      Expose a non-standard query port via IOCTL that will be used to expose
      port attributes that are specific to mlx5 devices.
      
      The new interface receives a port number to query and returns a structure
      that contains the available attributes for that port.  This will be used
      to fill the gap between pure DEVX use cases and use cases where the kernel
      needs to inform userspace about various kernel driver configurations that
      userspace must use in order to work correctly.
      
      A flags field is used to indicate which fields are valid on return.
      
      MLX5_IB_UAPI_QUERY_PORT_VPORT:
      	The vport number of the queried port.
      
      MLX5_IB_UAPI_QUERY_PORT_VPORT_VHCA_ID:
      	The VHCA ID of the vport of the queried port.
      
      MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_RX:
      	The vport's RX ICM address used for sw steering.
      
      MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_TX:
      	The vport's TX ICM address used for sw steering.
      
      MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0:
      	The metadata used to tag egress packets of the vport.
      
      MLX5_IB_UAPI_QUERY_PORT_ESW_OWNER_VHCA_ID:
      	The E-Switch owner VHCA ID of the vport.
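
      For illustration only, here is a minimal sketch of the kind of structure
      such a query could return, derived solely from the flag list above; the
      exact field names, widths and bit values in the mlx5 UAPI headers are
      assumptions and may differ:

        #include <linux/types.h>

        /* Bit values below are illustrative, not the authoritative UAPI ones. */
        enum {
                MLX5_IB_UAPI_QUERY_PORT_VPORT                 = 1 << 0,
                MLX5_IB_UAPI_QUERY_PORT_VPORT_VHCA_ID         = 1 << 1,
                MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_RX = 1 << 2,
                MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_TX = 1 << 3,
                MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0          = 1 << 4,
                MLX5_IB_UAPI_QUERY_PORT_ESW_OWNER_VHCA_ID     = 1 << 5,
        };

        /* Returned by the query; 'flags' reports which fields below are valid. */
        struct mlx5_ib_uapi_query_port_sketch {
                __aligned_u64 flags;
                __u16 vport;                 /* ..._VPORT */
                __u16 vport_vhca_id;         /* ..._VPORT_VHCA_ID */
                __u16 esw_owner_vhca_id;     /* ..._ESW_OWNER_VHCA_ID */
                __u64 vport_steering_icm_rx; /* RX ICM address for sw steering */
                __u64 vport_steering_icm_tx; /* TX ICM address for sw steering */
                __u64 reg_c0;                /* metadata tagging egress packets */
        };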
      
      Link: https://lore.kernel.org/r/6e2ef13e5a266a6c037eb0105eb1564c7bb52f23.1618743394.git.leonro@nvidia.com
      Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  3. 14 April 2021, 4 commits
  4. 13 April 2021, 1 commit
  5. 08 April 2021, 1 commit
  6. 04 April 2021, 2 commits
    • net/mlx5: Add dynamic MSI-X capabilities bits · 0b989c1e
      Authored by Leon Romanovsky
      These new fields declare the number of MSI-X vectors that can be
      allocated on the VF through PF configuration.
      
      The value must be in the range defined by min_dynamic_vf_msix_table_size
      and max_dynamic_vf_msix_table_size.
      
      The driver should continue to query its MSI-X table through the PCI
      configuration header.
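
      As a hedged sketch of how a PF driver might honour the new range
      (MLX5_CAP_GEN is the existing mlx5 capability accessor; the two field
      spellings are taken from the text above and are assumptions):

        #include <linux/errno.h>
        #include <linux/mlx5/driver.h>

        /* Sketch only: reject a requested per-VF MSI-X count outside the range
         * advertised by the device. */
        static int sketch_check_vf_msix_count(struct mlx5_core_dev *dev, int n)
        {
                int min = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size);
                int max = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size);

                return (n < min || n > max) ? -EINVAL : 0;
        }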
      
      Link: https://lore.kernel.org/linux-pci/20210314124256.70253-3-leon@kernel.org
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    • PCI/IOV: Add sysfs MSI-X vector assignment interface · c3d5c2d9
      Authored by Leon Romanovsky
      A typical cloud provider SR-IOV use case is to create many VFs for use by
      guest VMs. The VFs may not be assigned to a VM until a customer requests a
      VM of a certain size, e.g., number of CPUs. A VF may need MSI-X vectors
      proportional to the number of CPUs in the VM, but there is no standard way
      to change the number of MSI-X vectors supported by a VF.
      
      Some Mellanox ConnectX devices support dynamic assignment of MSI-X vectors
      to SR-IOV VFs. This can be done by the PF driver after VFs are enabled,
      and it can be done without affecting VFs that are already in use. The
      hardware supports a limited pool of MSI-X vectors that can be assigned to
      the PF or to individual VFs.  This is device-specific behavior that
      requires support in the PF driver.
      
      Add a read-only "sriov_vf_total_msix" sysfs file for the PF and a writable
      "sriov_vf_msix_count" file for each VF. Management software may use these
      to learn how many MSI-X vectors are available and to dynamically assign
      them to VFs before the VFs are passed through to a VM.
      
      If the PF driver implements the ->sriov_get_vf_total_msix() callback,
      "sriov_vf_total_msix" contains the total number of MSI-X vectors available
      for distribution among VFs.
      
      If no driver is bound to the VF, writing "N" to "sriov_vf_msix_count" uses
      the PF driver ->sriov_set_msix_vec_count() callback to assign "N" MSI-X
      vectors to the VF.  When a VF driver subsequently reads the MSI-X Message
      Control register, it will see the new Table Size "N".
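
      A small userspace sketch of how management software might drive these two
      files; the PCI addresses below are hypothetical placeholders:

        #include <stdio.h>

        int main(void)
        {
                int total = 0, n = 8;
                FILE *f;

                /* PF: how many MSI-X vectors are available for distribution. */
                f = fopen("/sys/bus/pci/devices/0000:01:00.0/sriov_vf_total_msix", "r");
                if (!f || fscanf(f, "%d", &total) != 1)
                        return 1;
                fclose(f);

                if (n > total)
                        return 1;

                /* VF (no driver bound): assign 'n' vectors to this VF. */
                f = fopen("/sys/bus/pci/devices/0000:01:00.2/sriov_vf_msix_count", "w");
                if (!f)
                        return 1;
                fprintf(f, "%d", n);
                return fclose(f) ? 1 : 0;
        }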
      
      Link: https://lore.kernel.org/linux-pci/20210314124256.70253-2-leon@kernel.org
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
  7. 02 April 2021, 1 commit
  8. 27 March 2021, 1 commit
  9. 26 March 2021, 1 commit
    • RDMA: Support more than 255 rdma ports · 1fb7f897
      Authored by Mark Bloch
      The current code uses many different types when dealing with a port of an
      RDMA device: u8, unsigned int and u32. Switch to u32 to clean up the logic.
      
      This allows us to make (at least) the core view consistent and use the
      same type. Unfortunately, not all places can be converted: many uverbs
      functions expect the port to be a u8, so those places are kept as-is in
      order not to break UAPIs. HW/spec-defined values must also not be changed.
      
      With the switch to u32 we can now support devices with more than 255
      ports. U32_MAX is reserved to make the control logic a bit easier to deal
      with. As a device with U32_MAX ports is unlikely to appear any time soon,
      this seems like a non-issue.
      
      When a device with more than 255 ports is created uverbs will report the
      RDMA device as having 255 ports as this is the max currently supported.
      
      The verbs interface is not changed yet because the IBTA spec limits the
      port size to u8 in too many places, and applications that rely on verbs
      would not be able to cope with this change. At this stage, we are only
      extending the interfaces that use the vendor channel.
      
      Once the limitation is lifted, mlx5 in switchdev mode will be able to have
      thousands of SFs created by the device. As the only instance of an RDMA
      device that reports more than 255 ports will be a representor device, and
      it exposes itself as a RAW Ethernet only device, CM/MAD/IPoIB and other
      ULPs aren't affected by this change, and their sysfs interfaces that are
      exposed to userspace can remain unchanged.
      
      While here, clean up some alignment issues and remove unneeded sanity
      checks (mainly in rdmavt).
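
      For flavour, the shape of the change on one core callback in struct
      ib_device_ops, sketched from the description above:

        /* Before: port numbers in the core were a mix of u8/unsigned int/u32. */
        int (*query_port)(struct ib_device *device, u8 port_num,
                          struct ib_port_attr *port_attr);

        /* After: the core consistently passes port numbers as u32. */
        int (*query_port)(struct ib_device *device, u32 port_num,
                          struct ib_port_attr *port_attr);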
      
      Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
  10. 23 March 2021, 1 commit
  11. 13 March 2021, 5 commits
  12. 12 March 2021, 5 commits
  13. 11 March 2021, 1 commit
  14. 05 March 2021, 1 commit
    • kernel: provide create_io_thread() helper · cc440e87
      Authored by Jens Axboe
      Provide a generic helper for setting up an io_uring worker. Returns a
      task_struct so that the caller can do whatever setup is needed, then call
      wake_up_new_task() to kick it into gear.
      
      Add a kernel_clone_args member, io_thread, which tells copy_process() to
      mark the task with PF_IO_WORKER.
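
      A hedged sketch of the resulting call pattern on the caller side;
      my_worker_fn, my_data and the surrounding error handling are placeholders:

        struct task_struct *tsk;

        /* create_io_thread(fn, arg, node) returns the new, not-yet-running
         * PF_IO_WORKER task so the caller can finish its setup first. */
        tsk = create_io_thread(my_worker_fn, my_data, NUMA_NO_NODE);
        if (IS_ERR(tsk))
                return PTR_ERR(tsk);

        /* ...caller-specific setup on 'tsk'... */

        wake_up_new_task(tsk);
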
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  15. 04 March 2021, 3 commits
  16. 03 March 2021, 2 commits
    • swap: fix swapfile read/write offset · caf6912f
      Authored by Jens Axboe
      We're not factoring in the start of the file for where to write and
      read the swapfile, which leads to very unfortunate side effects of
      writing where we should not be...
      
      Fixes: 48d15436 ("mm: remove get_swap_bio")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • KVM: x86/xen: Add support for vCPU runstate information · 30b5c851
      Authored by David Woodhouse
      This is how Xen guests do steal time accounting. The hypervisor records
      the amount of time spent in each of running/runnable/blocked/offline
      states.
      
      In the Xen accounting, a vCPU is still in state RUNSTATE_running while
      in Xen for a hypercall or I/O trap, etc. Only if Xen explicitly schedules
      does the state become RUNSTATE_blocked. In KVM this means that even when
      the vCPU exits the kvm_run loop, the state remains RUNSTATE_running.
      
      The VMM can explicitly set the vCPU to RUNSTATE_blocked by using the
      KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT attribute, and can also use
      KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST to retrospectively add a given
      amount of time to the blocked state and subtract it from the running
      state.
      
      The state_entry_time corresponds to get_kvmclock_ns() at the time the
      vCPU entered the current state, and the total times of all four states
      should always add up to state_entry_time.
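
      As an illustration of the VMM side, a hedged userspace sketch using the
      ADJUST attribute named above; the kvm_xen_vcpu_attr runstate field names
      are assumptions taken from this series and should be checked against
      <linux/kvm.h>:

        #include <linux/kvm.h>
        #include <stdint.h>
        #include <sys/ioctl.h>

        /* Sketch: retrospectively move delta_ns from the running state to the
         * blocked state, as described above.  Passing negative adjustments as
         * wrapped u64 values is an assumption about the ABI. */
        static int xen_runstate_adjust(int vcpu_fd, int64_t delta_ns)
        {
                struct kvm_xen_vcpu_attr attr = {
                        .type = KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST,
                        .u.runstate.time_running = (uint64_t)-delta_ns,
                        .u.runstate.time_blocked = (uint64_t)delta_ns,
                };

                return ioctl(vcpu_fd, KVM_XEN_VCPU_SET_ATTR, &attr);
        }
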
      Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Message-Id: <20210301125309.874953-2-dwmw2@infradead.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  17. 02 March 2021, 4 commits
  18. 27 February 2021, 2 commits