提交 · 88c5d0a2b9b043d0e000a4f64a28313e57d96a9c · openeuler / Kernel

21 7月, 2021 1 次提交

dmaengine: idxd: fix submission race window · 6b4b87f2

由 Dave Jiang 提交于 7月 14, 2021

Konstantin observed that when descriptors are submitted, the descriptor is
added to the pending list after the submission. This creates a race window
with the slight possibility that the descriptor can complete before it
gets added to the pending list and this window would cause the completion
handler to miss processing the descriptor.

To address the issue, the addition of the descriptor to the pending list
must be done before it gets submitted to the hardware. However, submitting
to swq with ENQCMDS instruction can cause a failure with the condition of
either wq is full or wq is not "active".

With the descriptor allocation being the gate to the wq capacity, it is not
possible to hit a retry with ENQCMDS submission to the swq. The only
possible failure can happen is when wq is no longer "active" due to hw
error and therefore we are moving towards taking down the portal. Given
this is a rare condition and there's no longer concern over I/O
performance, the driver can walk the completion lists in order to retrieve
and abort the descriptor.

The error path will set the descriptor to aborted status. It will take the
work list lock to prevent further processing of worklist. It will do a
delete_all on the pending llist to retrieve all descriptors on the pending
llist. The delete_all action does not require a lock. It will walk through
the acquired llist to find the aborted descriptor while add all remaining
descriptors to the work list since it holds the lock. If it does not find
the aborted descriptor on the llist, it will walk through the work
list. And if it still does not find the descriptor, then it means the
interrupt handler has removed the desc from the llist but is pending on
the work list lock and will process it once the error path releases the
lock.

Fixes: eb15e715 ("dmaengine: idxd: add interrupt handle request and release support")
Reported-by: NKonstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162628855747.360485.10101925573082466530.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

6b4b87f2

14 7月, 2021 2 次提交

dmaengine: idxd: assign MSIX vectors to each WQ rather than roundrobin · 6cfd9e62

由 Dave Jiang 提交于 6月 24, 2021

IOPS increased when changing MSIX vector to per WQ from roundrobin.
Allows descriptor to be completed by the submitter improves caching
locality.
Suggested-by: NKonstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Acked-by: NKonstantin Ananyev <konstantin.ananyev@intel.com>
Link: https://lore.kernel.org/r/162456717326.1130457.15258077196523268356.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

6cfd9e62

dmanegine: idxd: cleanup all device related bits after disabling device · 0dcfe41e

由 Dave Jiang 提交于 6月 04, 2021

The previous state cleanup patch only performed wq state cleanups. This
does not go far enough as when device is disabled or reset, the state
for groups and engines must also be cleaned up. Add additional state
cleanup beyond wq cleanup. Tie those cleanups directly to device
disable and reset, and wq disable and reset.

Fixes: da32b28c ("dmaengine: idxd: cleanup workqueue config after disabling")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162285154108.2096632.5572805472362321307.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

0dcfe41e

26 4月, 2021 1 次提交

dmaengine: idxd: Add IDXD performance monitor support · 81dd4d4d

由 Tom Zanussi 提交于 4月 24, 2021

Implement the IDXD performance monitor capability (named 'perfmon' in
the DSA (Data Streaming Accelerator) spec [1]), which supports the
collection of information about key events occurring during DSA and
IAX (Intel Analytics Accelerator) device execution, to assist in
performance tuning and debugging.

The idxd perfmon support is implemented as part of the IDXD driver and
interfaces with the Linux perf framework.  It has several features in
common with the existing uncore pmu support:

  - it does not support sampling
  - does not support per-thread counting

However it also has some unique features not present in the core and
uncore support:

  - all general-purpose counters are identical, thus no event constraints
  - operation is always system-wide

While the core perf subsystem assumes that all counters are by default
per-cpu, the uncore pmus are socket-scoped and use a cpu mask to
restrict counting to one cpu from each socket.  IDXD counters use a
similar strategy but expand the scope even further; since IDXD
counters are system-wide and can be read from any cpu, the IDXD perf
driver picks a single cpu to do the work (with cpu hotplug notifiers
to choose a different cpu if the chosen one is taken off-line).

More specifically, the perf userspace tool by default opens a counter
for each cpu for an event.  However, if it finds a cpumask file
associated with the pmu under sysfs, as is the case with the uncore
pmus, it will open counters only on the cpus specified by the cpumask.
Since perfmon only needs to open a single counter per event for a
given IDXD device, the perfmon driver will create a sysfs cpumask file
for the device and insert the first cpu of the system into it.  When a
user uses perf to open an event, perf will open a single counter on
the cpu specified by the cpu mask.  This amounts to the default
system-wide rather than per-cpu counting mentioned previously for
perfmon pmu events.  In order to keep the cpu mask up-to-date, the
driver implements cpu hotplug support for multiple devices, as IDXD
usually enumerates and registers more than one idxd device.

The perfmon driver implements basic perfmon hardware capability
discovery and configuration, and is initialized by the IDXD driver's
probe function.  During initialization, the driver retrieves the total
number of supported performance counters, the pmu ID, and the device
type from idxd device, and registers itself under the Linux perf
framework.

The perf userspace tool can be used to monitor single or multiple
events depending on the given configuration, as well as event groups,
which are also supported by the perfmon driver.  The user configures
events using the perf tool command-line interface by specifying the
event and corresponding event category, along with an optional set of
filters that can be used to restrict counting to specific work queues,
traffic classes, page and transfer sizes, and engines (See [1] for
specifics).

With the configuration specified by the user, the perf tool issues a
system call passing that information to the kernel, which uses it to
initialize the specified event(s).  The event(s) are opened and
started, and following termination of the perf command, they're
stopped.  At that point, the perfmon driver will read the latest count
for the event(s), calculate the difference between the latest counter
values and previously tracked counter values, and display the final
incremental count as the event count for the cycle.  An overflow
handler registered on the IDXD irq path is used to account for counter
overflows, which are signaled by an overflow interrupt.

Below are a couple of examples of perf usage for monitoring DSA events.

The following monitors all events in the 'engine' category.  Becuuse
no filters are specified, this captures all engine events for the
workload, which in this case is 19 iterations of the work generated by
the kernel dmatest module.

Details describing the events can be found in Appendix D of [1],
Performance Monitoring Events, but briefly they are:

  event 0x1:  total input data processed, in 32-byte units
  event 0x2:  total data written, in 32-byte units
  event 0x4:  number of work descriptors that read the source
  event 0x8:  number of work descriptors that write the destination
  event 0x10: number of work descriptors dispatched from batch descriptors
  event 0x20: number of work descriptors dispatched from work queues

 # perf stat -e dsa0/event=0x1,event_category=0x1/,
                dsa0/event=0x2,event_category=0x1/,
		dsa0/event=0x4,event_category=0x1/,
		dsa0/event=0x8,event_category=0x1/,
		dsa0/event=0x10,event_category=0x1/,
		dsa0/event=0x20,event_category=0x1/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

                 5,332      dsa0/event=0x1,event_category=0x1/
                 5,327      dsa0/event=0x2,event_category=0x1/
                    19      dsa0/event=0x4,event_category=0x1/
                    19      dsa0/event=0x8,event_category=0x1/
                     0      dsa0/event=0x10,event_category=0x1/
                    19      dsa0/event=0x20,event_category=0x1/

          21.977436186 seconds time elapsed

The command below illustrates filter usage with a simple example.  It
specifies that MEM_MOVE operations should be counted for the DSA
device dsa0 (event 0x8 corresponds to the EV_MEM_MOVE event - Number
of Memory Move Descriptors, which is part of event category 0x3 -
Operations. The detailed category and event IDs are available in
Appendix D, Performance Monitoring Events, of [1]).  In addition to
the event and event category, a number of filters are also specified
(the detailed filter values are available in Chapter 6.4 (Filter
Support) of [1]), which will restrict counting to only those events
that meet all of the filter criteria.  In this case, the filters
specify that only MEM_MOVE operations that are serviced by work queue
wq0 and specifically engine number engine0 and traffic class tc0
having sizes between 0 and 4k and page size of between 0 and 1G result
in a counter hit; anything else will be filtered out and not appear in
the final count.  Note that filters are optional - any filter not
specified is assumed to be all ones and will pass anything.

 # perf stat -e dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
                filter_eng=0x1,event=0x8,event_category=0x3/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

       19      dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
               filter_eng=0x1,event=0x8,event_category=0x3/

          21.865914091 seconds time elapsed

The output above reflects that the unspecified workload resulted in
the counting of 19 MEM_MOVE operation events that met the filter
criteria.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

[ Based on work originally by Jing Lin. ]
Reviewed-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NKan Liang <kan.liang@linux.intel.com>
Signed-off-by: NTom Zanussi <tom.zanussi@linux.intel.com>
Link: https://lore.kernel.org/r/0c5080a7d541904c4ad42b848c76a1ce056ddac7.1619276133.git.zanussi@kernel.orgSigned-off-by: NVinod Koul <vkoul@kernel.org>

81dd4d4d

24 4月, 2021 6 次提交

dmaengine: idxd: remove MSIX masking for interrupt handlers · a1610461

由 Dave Jiang 提交于 4月 20, 2021

Remove interrupt masking and just let the hard irq handler keep
firing for new events. This is less of a performance impact vs
the MMIO readback inside the pci_msi_{mask,unmas}_irq(). Especially
with a loaded system those flushes can be stuck behind large amounts
of MMIO writes to flush. When guest kernel is running on top of VFIO
mdev, mask/unmask causes a vmexit each time and is not desirable.
Suggested-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161894523436.3210025.1834640110556139277.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

a1610461

dmaengine: idxd: device cmd should use dedicated lock · 53b2ee7f

由 Dave Jiang 提交于 4月 20, 2021

Create a dedicated lock for device command operations. Put the device
command operation under finer grained locking instead of using the
idxd->dev_lock.
Suggested-by: NSanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894525685.3210132.16160045731436382560.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

53b2ee7f

dmaengine: idxd: support reporting of halt interrupt · 5b0c68c4

由 Dave Jiang 提交于 4月 20, 2021

Unmask the halt error interrupt so it gets reported to the interrupt
handler. When halt state interrupt is received, quiesce the kernel
WQs and unmap the portals to stop submission.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894441167.3202472.9485946398140619501.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

5b0c68c4

dmaengine: idxd: add interrupt handle request and release support · eb15e715

由 Dave Jiang 提交于 4月 20, 2021

DSA spec states that when Request Interrupt Handle and Release Interrupt
Handle command bits are set in the CMDCAP register, these device commands
must be supported by the driver.

The interrupt handle is programmed in a descriptor. When Request Interrupt
Handle is not supported, the interrupt handle is the index of the desired
entry in the MSI-X table. When the command is supported, driver must use
the command to obtain a handle to be programmed in the submitted
descriptor.

A requested handle may be revoked. After the handle is revoked, any use of
the handle will result in Invalid Interrupt Handle error.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894439422.3202472.17579543737810265471.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

eb15e715

dmaengine: idxd: add support for readonly config mode · 8c66bbdc

由 Dave Jiang 提交于 4月 20, 2021

The read-only configuration mode is defined by the DSA spec as a mode of
the device WQ configuration. When GENCAP register bit 31 is set to 0,
the device is in RO mode and group configuration and some fields of the
workqueue configuration registers are read-only and reflect the fixed
configuration of the device. Add support for RO mode. The driver will
load the values from the registers directly setup all the internally
cached data structures based on the device configuration.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894438847.3202472.6317563824045432727.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

8c66bbdc

dmaengine: idxd: add percpu_ref to descriptor submission path · 93a40a6d

由 Dave Jiang 提交于 4月 20, 2021

Current submission path has no way to restrict the submitter from
stop submiting on shutdown path or wq disable path. This provides a way to
quiesce the submission path.

Modeling after 'struct reqeust_queue' usage of percpu_ref. One of the
abilities of per_cpu reference counting is the ability to stop new
references from being taken while awaiting outstanding references to be
dropped. On wq shutdown, we want to block any new submissions to the kernel
workqueue and quiesce before disabling. The percpu_ref allows us to block
any new submissions and wait for any current submission calls to finish
submitting to the workqueue.

A percpu_ref is embedded in each idxd_wq context to allow control for
individual wq. The wq->wq_active counter is elevated before calling
movdir64b() or enqcmds() to submit a descriptor to the wq and dropped once
the submission call completes. The function is gated by
percpu_ref_tryget_live(). On shutdown with percpu_ref_kill() called, any
new submission would be blocked from acquiring a ref and failed. Once all
references are dropped for the wq, shutdown can continue.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161894438293.3202472.14894701611500822232.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

93a40a6d

20 4月, 2021 9 次提交

dmaengine: idxd: remove detection of device type · 435b512d

由 Dave Jiang 提交于 4月 15, 2021

Move all static data type for per device type to an idxd_driver_data data
structure. The data can be attached to the pci_device_id and provided by
the pci probe function. This removes a lot of unnecessary type detection
and setup code.
Suggested-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852988924.2203940.2787590808682466398.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

435b512d

dmaengine: idxd: iax bus removal · 4b73e4eb

由 Dave Jiang 提交于 4月 15, 2021

There is no need to have an additional bus for the IAX device. The removal
of IAX will change user ABI as /sys/bus/iax will no longer exist.
The iax device will be moved to the dsa bus. The device id for dsa and
iax will now be combined rather than unique for each device type in order
to accommodate the iax devices. This is in preparation for fixing the
sub-driver code for idxd. There's no hardware deployment for Sapphire
Rapids platform yet, which means that users have no reason to have
developed scripts against this ABI. There is some exposure to
released versions of accel-config, but those are being fixed up and
an accel-config upgrade is reasonable to get IAX support. As far as
accel-config is concerned IAX support starts when these devices appear
under /sys/bus/dsa, and old accel-config just assumes that an empty /
missing /sys/bus/iax just means a lack of platform support.

Fixes: f25b4638 ("dmaengine: idxd: add IAX configuration support in the IDXD driver")
Suggested-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852988298.2203940.4529909758034944428.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

4b73e4eb

dmaengine: idxd: fix cdev setup and free device lifetime issues · 04922b74

由 Dave Jiang 提交于 4月 15, 2021

The char device setup and cleanup has device lifetime issues regarding when
parts are initialized and cleaned up. The initialization of struct device is
done incorrectly. device_initialize() needs to be called on the 'struct
device' and then additional changes can be added. The ->release() function
needs to be setup via device_type before dev_set_name() to allow proper
cleanup. The change re-parents the cdev under the wq->conf_dev to get
natural reference inheritance. No known dependency on the old device path exists.
Reported-by: NJason Gunthorpe <jgg@nvidia.com>
Fixes: 42d279f9 ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852987721.2203940.1478218825576630810.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

04922b74

dmaengine: idxd: fix group conf_dev lifetime · defe49f9

由 Dave Jiang 提交于 4月 15, 2021

Remove devm_* allocation and fix group->conf_dev 'struct device'
lifetime. Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE.
Add release functions in order to free the allocated memory at the
group->conf_dev destruction time.
Reported-by: NJason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d560 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852987144.2203940.8830315575880047.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

defe49f9

dmaengine: idxd: fix engine conf_dev lifetime · 75b91130

由 Dave Jiang 提交于 4月 15, 2021

Remove devm_* allocation and fix engine->conf_dev 'struct device'
lifetime. Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE.
Add release functions in order to free the allocated memory at the
engine conf_dev destruction time.
Reported-by: NJason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d560 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852986460.2203940.16603218225412118431.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

75b91130

dmaengine: idxd: fix wq conf_dev 'struct device' lifetime · 7c5dd23e

由 Dave Jiang 提交于 4月 15, 2021

Remove devm_* allocation and fix wq->conf_dev 'struct device' lifetime.
Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE. Add release
functions in order to free the allocated memory for the wq context at
device destruction time.
Reported-by: NJason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d560 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852985907.2203940.6840120734115043753.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

7c5dd23e

dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime · 47c16ac2

由 Dave Jiang 提交于 4月 15, 2021

The devm managed lifetime is incompatible with 'struct device' objects that
resides in idxd context. This is one of the series that clean up the idxd
driver 'struct device' lifetime. Fix idxd->conf_dev 'struct device'
lifetime. Address issues flagged by CONFIG_DEBUG_KOBJECT_RELEASE.
Add release functions in order to free the allocated memory at the
appropriate time.
Reported-by: NJason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d560 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161852985319.2203940.4650791514462735368.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

47c16ac2

dmaengine: idxd: cleanup pci interrupt vector allocation management · 5fc8e85f

由 Dave Jiang 提交于 4月 15, 2021

The devm managed lifetime is incompatible with 'struct device' objects that
resides in idxd context. This is one of the series that clean up the idxd
driver 'struct device' lifetime. Remove devm managed pci interrupt vectors
and replace with unmanged allocators.
Reported-by: NJason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d560 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852983563.2203940.8116028229124776669.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

5fc8e85f

dmaengine: idxd: fix dma device lifetime · 39786285

由 Dave Jiang 提交于 4月 15, 2021

The devm managed lifetime is incompatible with 'struct device' objects that
resides in idxd context. This is one of the series that clean up the idxd
driver 'struct device' lifetime. Remove embedding of dma_device and dma_chan
in idxd since it's not the only interface that idxd will use. The freeing of
the dma_device will be managed by the ->release() function.
Reported-by: NJason Gunthorpe <jgg@nvidia.com>
Fixes: bfe1d560 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/161852983001.2203940.14817017492384561719.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

39786285

13 4月, 2021 2 次提交

dmaengine: idxd: fix wq cleanup of WQCFG registers · ea9aadc0

由 Dave Jiang 提交于 4月 12, 2021

A pre-release silicon erratum workaround where wq reset does not clear
WQCFG registers was leaked into upstream code. Use wq reset command
instead of blasting the MMIO region. This also address an issue where
we clobber registers in future devices.

Fixes: da32b28c ("dmaengine: idxd: cleanup workqueue config after disabling")
Reported-by: NShreenivaas Devarajan <shreenivaas.devarajan@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161824330020.881560.16375921906426627033.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

ea9aadc0

dmaengine: idxd: clear MSIX permission entry on shutdown · 6df0e6c5

由 Dave Jiang 提交于 4月 12, 2021

Add disabling/clearing of MSIX permission entries on device shutdown to
mirror the enabling of the MSIX entries on probe. Current code left the
MSIX enabled and the pasid entries still programmed at device shutdown.

Fixes: 8e50d392 ("dmaengine: idxd: Add shared workqueue support")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161824457969.882533.6020239898682672311.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

6df0e6c5

03 2月, 2021 1 次提交

dmaengine: idxd: check device state before issue command · 89e3becd

由 Dave Jiang 提交于 2月 01, 2021

Add device state check before executing command. Without the check the
command can be issued while device is in halt state and causes the driver to
block while waiting for the completion of the command.
Reported-by: NSanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Tested-by: NSanjay Kumar <sanjay.k.kumar@intel.com>
Fixes: 0d5c10b4 ("dmaengine: idxd: add work queue drain support")
Link: https://lore.kernel.org/r/161219313921.2976211.12222625226450097465.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

89e3becd

11 12月, 2020 1 次提交

dmaengine: idxd: add IAX configuration support in the IDXD driver · f25b4638

由 Dave Jiang 提交于 11月 17, 2020

Add support to allow configuration of Intel Analytics Accelerator (IAX) in
addition to the Intel Data Streaming Accelerator (DSA). The IAX hardware
has the same configuration interface as DSA. The main difference
is the type of operations it performs. We can support the DSA and
IAX devices on the same driver with some tweaks.

IAX has a 64B completion record that needs to be 64B aligned, as opposed to
a 32B completion record that is 32B aligned for DSA. IAX also does not
support token management.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160564555488.1834439.4261958859935360473.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

f25b4638

10 12月, 2020 1 次提交

dmaengine: idxd: add ATS disable knob for work queues · 92de5fa2

由 Dave Jiang 提交于 11月 13, 2020

With the DSA spec 1.1 update, a knob to disable ATS for individually is
introduced. Add enabling code to allow a system admin to make the
configuration through sysfs.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160530810593.1288392.2561048329116529566.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

92de5fa2

30 10月, 2020 3 次提交

dmaengine: idxd: Clean up descriptors with fault error · e4f4d8cd

由 Dave Jiang 提交于 10月 27, 2020

Add code to "complete" a descriptor when the descriptor or its completion
address hit a fault error when SVA mode is being used. This error can be
triggered due to bad programming by the user. A lock is introduced in order
to protect the descriptor completion lists since the fault handler will run
from the system work queue after being scheduled in the interrupt handler.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/160382008092.3911367.12766483427643278985.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

e4f4d8cd

dmaengine: idxd: Add shared workqueue support · 8e50d392

由 Dave Jiang 提交于 10月 27, 2020

Add shared workqueue support that includes the support of Shared Virtual
memory (SVM) or in similar terms On Demand Paging (ODP). The shared
workqueue uses the enqcmds command in kernel and will respond with retry if
the workqueue is full. Shared workqueue only works when there is PASID
support from the IOMMU.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/160382007499.3911367.26043087963708134.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

8e50d392

dmaengine: idxd: fix wq config registers offset programming · d98793b5

由 Dave Jiang 提交于 10月 27, 2020

DSA spec v1.1 [1] updated to include a stride size register for WQ
configuration that will specify how much space is reserved for the WQ
configuration register set. This change is expected to be in the final
gen1 DSA hardware. Fix the driver to use WQCFG_OFFSET() for all WQ
offset calculation and fixup WQCFG_OFFSET() to use the new calculated
wq size.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

Fixes: bfe1d560 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160383444959.48058.14249265538404901781.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

d98793b5

28 10月, 2020 1 次提交

dmaengine: idxd: fix wq config registers offset programming · 484f910e

由 Dave Jiang 提交于 10月 27, 2020

DSA spec v1.1 [1] updated to include a stride size register for WQ
configuration that will specify how much space is reserved for the WQ
configuration register set. This change is expected to be in the final
gen1 DSA hardware. Fix the driver to use WQCFG_OFFSET() for all WQ
offset calculation and fixup WQCFG_OFFSET() to use the new calculated
wq size.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

Fixes: bfe1d560 ("dmaengine: idxd: Init and probe for Intel data accelerators")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/160383444959.48058.14249265538404901781.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

484f910e

03 9月, 2020 3 次提交

dmaengine: idxd: add command status to idxd sysfs attribute · ff18de55

由 Dave Jiang 提交于 8月 28, 2020

Export admin command status to sysfs attribute in order to allow user to
retrieve configuration error. Allows user tooling to retrieve the command
error and provide more user friendly error messages.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159865278770.29455.8026892329182750127.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

ff18de55

dmaengine: idxd: add support for configurable max wq batch size · e7184b15

由 Dave Jiang 提交于 8月 28, 2020

Add sysfs attribute max_batch_size to wq in order to allow the max batch
size configured on a per wq basis. Add support code to configure
the valid user input on wq enable. This is a performance tuning
parameter.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159865273617.29141.4383066301730821749.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

e7184b15

dmaengine: idxd: add support for configurable max wq xfer size · d7aad555

由 Dave Jiang 提交于 8月 28, 2020

Add sysfs attribute max_xfer_size to wq in order to allow the max xfer
size configured on a per wq basis. Add support code to configure
the valid user input on wq enable. This is a performance tuning
parameter.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159865265404.29141.3049399618578194052.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

d7aad555

13 7月, 2020 2 次提交

dmaengine: idxd: move idxd interrupt handling to mask instead of ignore · 4548a6ad

由 Dave Jiang 提交于 6月 26, 2020

Switch driver to use MSIX mask and unmask instead of the ignore bit.
When ignore bit is cleared, we must issue an MMIO read to ensure writes
have all arrived and check and process any additional completions. The
ignore bit does not queue up any pending MSIX interrupts. The mask bit
however does. Use API call from interrupt subsystem to mask MSIX
interrupt since the hardware does not have convenient mask bit register.
Suggested-by: NAshok Raj <ashok.raj@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/159319517621.70410.11816465052708900506.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

4548a6ad

dmaengine: idxd: add work queue drain support · 0d5c10b4

由 Dave Jiang 提交于 6月 26, 2020

Add wq drain support. When a wq is being released, it needs to wait for
all in-flight operation to complete. A device control function
idxd_wq_drain() has been added to facilitate this. A wq drain call
is added to the char dev on release to make sure all user operations are
complete. A wq drain is also added before the wq is being disabled.

A drain command can take an unpredictable period of time. Interrupt support
for device commands is added to allow waiting on the command to
finish. If a previous command is in progress, the new submitter can block
until the current command is finished before proceeding. The interrupt
based submission will submit the command and then wait until a command
completion interrupt happens to complete. All commands are moved to the
interrupt based command submission except for the device reset during
probe, which will be polled.

Fixes: 42d279f9 ("dmaengine: idxd: add char driver to expose submission portal to userland")
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/159319502515.69593.13451647706946040301.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

0d5c10b4

02 7月, 2020 1 次提交

dmaengine: idxd: cleanup workqueue config after disabling · da32b28c

由 Dave Jiang 提交于 6月 25, 2020

After disabling a device, we should clean up the internal state for
the wqs and zero out the configuration registers. Without doing so can cause
issues when the user reprogram the wqs.

Fixes: c52ca478 ("dmaengine: idxd: add configuration component of driver")
Reported-by: NYixin Zhang <yixin.zhang@intel.com>
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Tested-by: NYixin Zhang <yixin.zhang@intel.com>
Link: https://lore.kernel.org/r/159311264246.1198.11955791213681679428.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

da32b28c

24 6月, 2020 1 次提交

dmaengine: idxd: move submission to sbitmap_queue · 0705107f

由 Dave Jiang 提交于 6月 15, 2020

Kill the percpu-rwsem for work submission in favor of an sbitmap_queue.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Reviewed-by: NDan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/159225446631.68253.8860709181621260997.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

0705107f

24 1月, 2020 5 次提交

dmaengine: idxd: add char driver to expose submission portal to userland · 42d279f9

由 Dave Jiang 提交于 1月 21, 2020

Create a char device region that will allow acquisition of user portals in
order to allow applications to submit DMA operations. A char device will be
created per work queue that gets exposed. The workqueue type "user"
is used to mark a work queue for user char device. For example if the
workqueue 0 of DSA device 0 is marked for char device, then a device node
of /dev/dsa/wq0.0 will be created.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965026985.73301.976523230037106742.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

42d279f9

dmaengine: idxd: connect idxd to dmaengine subsystem · 8f47d1a5

由 Dave Jiang 提交于 1月 21, 2020

Add plumbing for dmaengine subsystem connection. The driver register a DMA
device per DSA device. The channels are dynamically registered when a
workqueue is configured to be "kernel:dmanegine" type.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965026376.73301.13867988830650740445.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

8f47d1a5

dmaengine: idxd: add descriptor manipulation routines · d1dfe5b8

由 Dave Jiang 提交于 1月 21, 2020

This commit adds helper functions for DSA descriptor allocation,
submission, and free operations.
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965025757.73301.12692876585357550065.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

d1dfe5b8

dmaengine: idxd: add configuration component of driver · c52ca478

由 Dave Jiang 提交于 1月 21, 2020

The device is left unconfigured when the driver is loaded. Various
components are configured via the driver sysfs attributes. Once
configuration is done, the device can be enabled by writing the device name
to the bind attribute of the device driver sysfs. Disabling can be done
similarly. Also the individual work queues can also be enabled and disabled
through the bind/unbind attributes. A constructed hierarchy is created
through the struct device framework in order to provide appropriate
configuration points and device state and status. This hierarchy is
presented off the virtual DSA bus.

i.e. /sys/bus/dsa/...
Signed-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965024585.73301.6431413676230150589.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

c52ca478

dmaengine: idxd: Init and probe for Intel data accelerators · bfe1d560

由 Dave Jiang 提交于 1月 21, 2020

The idxd driver introduces the Intel Data Stream Accelerator [1] that will
be available on future Intel Xeon CPUs. One of the kernel access
point for the driver is through the dmaengine subsystem. It will initially
provide the DMA copy service to the kernel.

Some of the main functionality introduced with this accelerator
are: shared virtual memory (SVM) support, and descriptor submission using
Intel CPU instructions movdir64b and enqcmds. There will be additional
accelerator devices that share the same driver with variations to
capabilities.

This commit introduces the probe and initialization component of the
driver.

[1]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specificationSigned-off-by: NDave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965023991.73301.6186843973135311580.stgit@djiang5-desk3.ch.intel.comSigned-off-by: NVinod Koul <vkoul@kernel.org>

bfe1d560

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功