Commit b15cdca5, authored by Sean Paul

Merge remote-tracking branch 'airlied/drm-next' into drm-misc-next-fixes

Backmerging airlied/drm-next
No related merge requests
......@@ -5,7 +5,7 @@ with HDMI output and the HVS (Hardware Video Scaler) for compositing
display planes.
Required properties for VC4:
- compatible: Should be "brcm,bcm2835-vc4"
- compatible: Should be "brcm,bcm2835-vc4" or "brcm,cygnus-vc4"
Required properties for Pixel Valve:
- compatible: Should be one of "brcm,bcm2835-pixelvalve0",
......@@ -54,11 +54,14 @@ Required properties for VEC:
See bindings/interrupt-controller/brcm,bcm2835-armctrl-ic.txt
Required properties for V3D:
- compatible: Should be "brcm,bcm2835-v3d"
- compatible: Should be "brcm,bcm2835-v3d" or "brcm,cygnus-v3d"
- reg: Physical base address and length of the V3D's registers
- interrupts: The interrupt number
See bindings/interrupt-controller/brcm,bcm2835-armctrl-ic.txt
Optional properties for V3D:
- clocks: The clock the unit runs on
Required properties for DSI:
- compatible: Should be "brcm,bcm2835-dsi0" or "brcm,bcm2835-dsi1"
- reg: Physical base address and length of the DSI block's registers
......
......@@ -8,12 +8,13 @@ Required properties:
- compatible: value should be one of:
"samsung,exynos5433-decon", "samsung,exynos5433-decon-tv";
- reg: physical base address and length of the DECON registers set.
- interrupts: should contain a list of all DECON IP block interrupts in the
order: VSYNC, LCD_SYSTEM. The interrupt specifier format
depends on the interrupt controller used.
- interrupt-names: should contain the interrupt names: "vsync", "lcd_sys"
in the same order as they were listed in the interrupts
property.
- interrupt-names: should contain the interrupt names depending on mode of work:
video mode: "vsync",
command mode: "lcd_sys",
command mode with software trigger: "lcd_sys", "te".
- interrupts or interrupts-extended: list of interrupt specifiers corresponding
to names provided in interrupt-names, as described in
interrupt-controller/interrupts.txt
- clocks: must include clock specifiers corresponding to entries in the
clock-names property.
- clock-names: list of clock names sorted in the same order as the clocks
......
AU Optronics Corporation 31.5" FHD (1920x1080) TFT LCD panel
Required properties:
- compatible: should be "auo,p320hvn03"
- power-supply: as specified in the base binding
This binding is compatible with the simple-panel binding, which is specified
in simple-panel.txt in this directory.
Innolux P079ZCA 7.85" 768x1024 TFT LCD panel
Required properties:
- compatible: should be "innolux,p079zca"
- reg: DSI virtual channel of the peripheral
- power-supply: phandle of the regulator that provides the supply voltage
- enable-gpios: panel enable gpio
Optional properties:
- backlight: phandle of the backlight device attached to the panel
Example:
&mipi_dsi {
panel {
compatible = "innolux,p079zca";
reg = <0>;
power-supply = <...>;
backlight = <&backlight>;
enable-gpios = <&gpio1 13 GPIO_ACTIVE_HIGH>;
status = "okay";
};
};
NEC LCD Technologies, Ltd. 12.1" WXGA (1280x800) LVDS TFT LCD panel
Required properties:
- compatible: should be "nec,nl12880bc20-05"
- power-supply: as specified in the base binding
This binding is compatible with the simple-panel binding, which is specified
in simple-panel.txt in this directory.
NLT Technologies, Ltd. 15.6" FHD (1920x1080) LVDS TFT LCD panel
Required properties:
- compatible: should be "nlt,nl192108ac18-02d"
- power-supply: as specified in the base binding
This binding is compatible with the simple-panel binding, which is specified
in simple-panel.txt in this directory.
Samsung S6E3HA2 5.7" 1440x2560 AMOLED panel
Samsung S6E3HF2 5.65" 1600x2560 AMOLED panel
Required properties:
- compatible: "samsung,s6e3ha2"
- compatible: should be one of:
"samsung,s6e3ha2",
"samsung,s6e3hf2".
- reg: the virtual channel number of a DSI peripheral
- vdd3-supply: I/O voltage supply
- vci-supply: voltage supply for analog circuits
......
* STMicroelectronics STM32 lcd-tft display controller
- ltdc: lcd-tft display controller host
must be a sub-node of st-display-subsystem
Required properties:
- compatible: "st,stm32-ltdc"
- reg: Physical base address of the IP registers and length of memory mapped region.
- clocks: A list of phandle + clock-specifier pairs, one for each
entry in 'clock-names'.
- clock-names: A list of clock names. For ltdc it should contain:
- "lcd" for the clock feeding the output pixel clock & IP clock.
- resets: reset to be used by the device (defined by use of RCC macro).
Required nodes:
- Video port for RGB output.
Example:
/ {
...
soc {
...
ltdc: display-controller@40016800 {
compatible = "st,stm32-ltdc";
reg = <0x40016800 0x200>;
interrupts = <88>, <89>;
resets = <&rcc STM32F4_APB2_RESET(LTDC)>;
clocks = <&rcc 1 CLK_LCD>;
clock-names = "lcd";
port {
ltdc_out_rgb: endpoint {
};
};
};
};
};
......@@ -4,6 +4,44 @@ Allwinner A10 Display Pipeline
The Allwinner A10 Display pipeline is composed of several components
that are going to be documented below:
For the input port of all components up to the TCON in the display
pipeline, if there are multiple components, the local endpoint IDs
must correspond to the index of the upstream block. For example, if
the remote endpoint is Frontend 1, then the local endpoint ID must
be 1.
Conversely, for the output ports of the same group, the remote endpoint
ID must be the index of the local hardware block. If the local backend
is backend 1, then the remote endpoint ID must be 1.
HDMI Encoder
------------
The HDMI Encoder supports the HDMI video and audio outputs, and does
CEC. It is one end of the pipeline.
Required properties:
- compatible: value must be one of:
* allwinner,sun5i-a10s-hdmi
- reg: base address and size of memory-mapped region
- interrupts: interrupt associated to this IP
- clocks: phandles to the clocks feeding the HDMI encoder
* ahb: the HDMI interface clock
* mod: the HDMI module clock
* pll-0: the first video PLL
* pll-1: the second video PLL
- clock-names: the clock names mentioned above
- dmas: phandles to the DMA channels used by the HDMI encoder
* ddc-tx: The channel for DDC transmission
* ddc-rx: The channel for DDC reception
* audio-tx: The channel used for audio transmission
- dma-names: the channel names mentioned above
- ports: A ports node with endpoint definitions as defined in
Documentation/devicetree/bindings/media/video-interfaces.txt. The
first port should be the input endpoint. The second should be the
output, usually to an HDMI connector.
TV Encoder
----------
......@@ -31,6 +69,7 @@ Required properties:
* allwinner,sun6i-a31-tcon
* allwinner,sun6i-a31s-tcon
* allwinner,sun8i-a33-tcon
* allwinner,sun8i-v3s-tcon
- reg: base address and size of memory-mapped region
- interrupts: interrupt associated to this IP
- clocks: phandles to the clocks feeding the TCON. Three are needed:
......@@ -47,12 +86,15 @@ Required properties:
Documentation/devicetree/bindings/media/video-interfaces.txt. The
first port should be the input endpoint, the second one the output
The output should have two endpoints. The first is the block
connected to the TCON channel 0 (usually a panel or a bridge), the
second the block connected to the TCON channel 1 (usually the TV
encoder)
The output may have multiple endpoints. The TCON has two channels,
usually with the first channel being used for the panels interfaces
(RGB, LVDS, etc.), and the second being used for the outputs that
require another controller (TV Encoder, HDMI, etc.). The endpoints
will take an extra property, allwinner,tcon-channel, to specify the
channel the endpoint is associated to. If that property is not
present, the endpoint number will be used as the channel number.
On SoCs other than the A33, there is one more clock required:
On SoCs other than the A33 and V3s, there is one more clock required:
- 'tcon-ch1': The clock driving the TCON channel 1
DRC
......@@ -138,6 +180,26 @@ Required properties:
Documentation/devicetree/bindings/media/video-interfaces.txt. The
first port should be the input endpoints, the second one the outputs
Display Engine 2.0 Mixer
------------------------
The DE2 mixer has many functionalities; currently only layer blending is
supported.
Required properties:
- compatible: value must be one of:
* allwinner,sun8i-v3s-de2-mixer
- reg: base address and size of the memory-mapped region.
- clocks: phandles to the clocks feeding the mixer
* bus: the mixer interface clock
* mod: the mixer module clock
- clock-names: the clock names mentioned above
- resets: phandles to the reset controllers driving the mixer
- ports: A ports node with endpoint definitions as defined in
Documentation/devicetree/bindings/media/video-interfaces.txt. The
first port should be the input endpoints, the second one the output
Display Engine Pipeline
-----------------------
......@@ -148,13 +210,15 @@ extra node.
Required properties:
- compatible: value must be one of:
* allwinner,sun5i-a10s-display-engine
* allwinner,sun5i-a13-display-engine
* allwinner,sun6i-a31-display-engine
* allwinner,sun6i-a31s-display-engine
* allwinner,sun8i-a33-display-engine
* allwinner,sun8i-v3s-display-engine
- allwinner,pipelines: list of phandle to the display engine
frontends available.
frontends (DE 1.0) or mixers (DE 2.0) available.
Example:
......@@ -173,6 +237,57 @@ panel: panel {
};
};
connector {
compatible = "hdmi-connector";
type = "a";
port {
hdmi_con_in: endpoint {
remote-endpoint = <&hdmi_out_con>;
};
};
};
hdmi: hdmi@01c16000 {
compatible = "allwinner,sun5i-a10s-hdmi";
reg = <0x01c16000 0x1000>;
interrupts = <58>;
clocks = <&ccu CLK_AHB_HDMI>, <&ccu CLK_HDMI>,
<&ccu CLK_PLL_VIDEO0_2X>,
<&ccu CLK_PLL_VIDEO1_2X>;
clock-names = "ahb", "mod", "pll-0", "pll-1";
dmas = <&dma SUN4I_DMA_NORMAL 16>,
<&dma SUN4I_DMA_NORMAL 16>,
<&dma SUN4I_DMA_DEDICATED 24>;
dma-names = "ddc-tx", "ddc-rx", "audio-tx";
status = "disabled";
ports {
#address-cells = <1>;
#size-cells = <0>;
port@0 {
#address-cells = <1>;
#size-cells = <0>;
reg = <0>;
hdmi_in_tcon0: endpoint {
remote-endpoint = <&tcon0_out_hdmi>;
};
};
port@1 {
#address-cells = <1>;
#size-cells = <0>;
reg = <1>;
hdmi_out_con: endpoint {
remote-endpoint = <&hdmi_con_in>;
};
};
};
};
tve0: tv-encoder@01c0a000 {
compatible = "allwinner,sun4i-a10-tv-encoder";
reg = <0x01c0a000 0x1000>;
......
......@@ -58,6 +58,18 @@ Required properties:
integer cells. The first cell is the offset of SYSCTRL register used
to control TV Encoder DAC power, and the second cell is the bit mask.
* VGA output device
Required properties:
- compatible: should be "zte,zx296718-vga"
- reg: Physical base address and length of the VGA device IO region
- interrupts : VGA interrupt number to CPU
- clocks: Phandle with clock-specifier pointing to VGA I2C clock.
- clock-names: Must be "i2c_wclk".
- zte,vga-power-control: the phandle to SYSCTRL block followed by two
integer cells. The first cell is the offset of SYSCTRL register used
to control VGA DAC power, and the second cell is the bit mask.
Example:
vou: vou@1440000 {
......@@ -81,6 +93,15 @@ vou: vou@1440000 {
"main_wclk", "aux_wclk";
};
vga: vga@8000 {
compatible = "zte,zx296718-vga";
reg = <0x8000 0x1000>;
interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&topcrm VGA_I2C_WCLK>;
clock-names = "i2c_wclk";
zte,vga-power-control = <&sysctrl 0x170 0xe0>;
};
hdmi: hdmi@c000 {
compatible = "zte,zx296718-hdmi";
reg = <0xc000 0x4000>;
......
......@@ -219,6 +219,7 @@ nexbox Nexbox
newhaven Newhaven Display International
ni National Instruments
nintendo Nintendo
nlt NLT Technologies, Ltd.
nokia Nokia
nordic Nordic Semiconductor
nuvoton Nuvoton Technology Corporation
......
......@@ -98,6 +98,9 @@ DRIVER_ATOMIC
implement appropriate obj->atomic_get_property() vfuncs for any
modeset objects with driver specific properties.
DRIVER_SYNCOBJ
Driver supports DRM sync objects.
Major, Minor and Patchlevel
~~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -149,60 +152,15 @@ Device Instance and Driver Handling
Driver Load
-----------
IRQ Registration
~~~~~~~~~~~~~~~~
The DRM core tries to facilitate IRQ handler registration and
unregistration by providing :c:func:`drm_irq_install()` and
:c:func:`drm_irq_uninstall()` functions. Those functions only
support a single interrupt per device, devices that use more than one
IRQs need to be handled manually.
Managed IRQ Registration
''''''''''''''''''''''''
:c:func:`drm_irq_install()` starts by calling the irq_preinstall
driver operation. The operation is optional and must make sure that the
interrupt will not get fired by clearing all pending interrupt flags or
disabling the interrupt.
The passed-in IRQ will then be requested by a call to
:c:func:`request_irq()`. If the DRIVER_IRQ_SHARED driver feature
flag is set, a shared (IRQF_SHARED) IRQ handler will be requested.
The IRQ handler function must be provided as the mandatory irq_handler
driver operation. It will get passed directly to
:c:func:`request_irq()` and thus has the same prototype as all IRQ
handlers. It will get called with a pointer to the DRM device as the
second argument.
Finally the function calls the optional irq_postinstall driver
operation. The operation usually enables interrupts (excluding the
vblank interrupt, which is enabled separately), but drivers may choose
to enable/disable interrupts at a different time.
:c:func:`drm_irq_uninstall()` is similarly used to uninstall an
IRQ handler. It starts by waking up all processes waiting on a vblank
interrupt to make sure they don't hang, and then calls the optional
irq_uninstall driver operation. The operation must disable all hardware
interrupts. Finally the function frees the IRQ by calling
:c:func:`free_irq()`.
Manual IRQ Registration
'''''''''''''''''''''''
Drivers that require multiple interrupt handlers can't use the managed
IRQ registration functions. In that case IRQs must be registered and
unregistered manually (usually with the :c:func:`request_irq()` and
:c:func:`free_irq()` functions, or their :c:func:`devm_request_irq()` and
:c:func:`devm_free_irq()` equivalents).
When manually registering IRQs, drivers must not set the
DRIVER_HAVE_IRQ driver feature flag, and must not provide the
irq_handler driver operation. They must set the :c:type:`struct
drm_device <drm_device>` irq_enabled field to 1 upon
registration of the IRQs, and clear it to 0 after unregistering the
IRQs.
IRQ Helper Library
~~~~~~~~~~~~~~~~~~
.. kernel-doc:: drivers/gpu/drm/drm_irq.c
:doc: irq helpers
.. kernel-doc:: drivers/gpu/drm/drm_irq.c
:export:
Memory Manager Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -143,6 +143,12 @@ Bridge Helper Reference
.. kernel-doc:: drivers/gpu/drm/drm_bridge.c
:export:
Panel-Bridge Helper Reference
-----------------------------
.. kernel-doc:: drivers/gpu/drm/bridge/panel.c
:export:
.. _drm_panel_helper:
Panel Helper Reference
......
......@@ -612,8 +612,8 @@ operation handler.
Vertical Blanking and Interrupt Handling Functions Reference
------------------------------------------------------------
.. kernel-doc:: include/drm/drm_irq.h
.. kernel-doc:: include/drm/drm_vblank.h
:internal:
.. kernel-doc:: drivers/gpu/drm/drm_irq.c
.. kernel-doc:: drivers/gpu/drm/drm_vblank.c
:export:
......@@ -484,3 +484,15 @@ DRM Cache Handling
.. kernel-doc:: drivers/gpu/drm/drm_cache.c
:export:
DRM Sync Objects
===========================
.. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
:doc: Overview
.. kernel-doc:: include/drm/drm_syncobj.h
:export:
.. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
:export:
......@@ -12,6 +12,8 @@ Linux GPU Driver Developer's Guide
drm-uapi
i915
meson
pl111
tegra
tinydrm
vc4
vga-switcheroo
......
==========================================
drm/pl111 ARM PrimeCell PL111 CLCD Driver
==========================================
.. kernel-doc:: drivers/gpu/drm/pl111/pl111_drv.c
:doc: ARM PrimeCell PL111 CLCD Driver
===============================================
drm/tegra NVIDIA Tegra GPU and display driver
===============================================
NVIDIA Tegra SoCs support a set of display, graphics and video functions via
the host1x controller. host1x supplies command streams, gathered from a push
buffer provided directly by the CPU, to its clients via channels. Software,
or blocks amongst themselves, can use syncpoints for synchronization.
Up until, but not including, Tegra124 (aka Tegra K1) the drm/tegra driver
supports the built-in GPU, comprised of the gr2d and gr3d engines. Starting
with Tegra124 the GPU is based on the NVIDIA desktop GPU architecture and
supported by the drm/nouveau driver.
The drm/tegra driver supports NVIDIA Tegra SoC generations since Tegra20. It
has three parts:
- A host1x driver that provides infrastructure and access to the host1x
services.
- A KMS driver that supports the display controllers as well as a number of
outputs, such as RGB, HDMI, DSI, and DisplayPort.
- A set of custom userspace IOCTLs that can be used to submit jobs to the
GPU and video engines via host1x.
Driver Infrastructure
=====================
The various host1x clients need to be bound together into a logical device in
order to expose their functionality to users. The infrastructure that supports
this is implemented in the host1x driver. When a driver is registered with the
infrastructure it provides a list of compatible strings specifying the devices
that it needs. The infrastructure creates a logical device and scans the device
tree for matching device nodes, adding the required clients to a list. Drivers
for individual clients register with the infrastructure as well and are added
to the logical host1x device.
Once all clients are available, the infrastructure will initialize the logical
device using a driver-provided function which will set up the bits specific to
the subsystem and in turn initialize each of its clients.
Similarly, when one of the clients is unregistered, the infrastructure will
destroy the logical device by calling back into the driver, which ensures that
the subsystem specific bits are torn down and the clients destroyed in turn.
Host1x Infrastructure Reference
-------------------------------
.. kernel-doc:: include/linux/host1x.h
.. kernel-doc:: drivers/gpu/host1x/bus.c
:export:
Host1x Syncpoint Reference
--------------------------
.. kernel-doc:: drivers/gpu/host1x/syncpt.c
:export:
KMS driver
==========
The display hardware has remained mostly backwards compatible over the various
Tegra SoC generations, up until Tegra186 which introduces several changes that
make it difficult to support with a parameterized driver.
Display Controllers
-------------------
Tegra SoCs have two display controllers, each of which can be associated with
zero or more outputs. Outputs can also share a single display controller, but
only if they run with compatible display timings. Two display controllers can
also share a single framebuffer, allowing cloned configurations even if modes
on two outputs don't match. A display controller is modelled as a CRTC in KMS
terms.
On Tegra186, the number of display controllers has been increased to three. A
display controller can no longer drive all of the outputs. While two of these
controllers can drive both DSI outputs and both SOR outputs, the third cannot
drive any DSI.
Windows
~~~~~~~
A display controller controls a set of windows that can be used to composite
multiple buffers onto the screen. While it is possible to assign arbitrary Z
ordering to individual windows (by programming the corresponding blending
registers), this is currently not supported by the driver. Instead, it will
assume a fixed Z ordering of the windows (window A is the root window, that
is, the lowest, while windows B and C are overlaid on top of window A). The
overlay windows support multiple pixel formats and can automatically convert
from YUV to RGB at scanout time. This makes them useful for displaying video
content. In KMS, each window is modelled as a plane. Each display controller
has a hardware cursor that is exposed as a cursor plane.
Outputs
-------
The type and number of supported outputs varies between Tegra SoC generations.
All generations support at least HDMI. While earlier generations supported the
very simple RGB interfaces (one per display controller), recent generations no
longer do and instead provide standard interfaces such as DSI and eDP/DP.
Outputs are modelled as a composite encoder/connector pair.
RGB/LVDS
~~~~~~~~
This interface is no longer available since Tegra124. It has been replaced by
the more standard DSI and eDP interfaces.
HDMI
~~~~
HDMI is supported on all Tegra SoCs. Starting with Tegra210, HDMI is provided
by the versatile SOR output, which supports eDP, DP and HDMI. The SOR is able
to support HDMI 2.0, though support for this is currently not merged.
DSI
~~~
Although Tegra has supported DSI since Tegra30, the controller has changed in
several ways in Tegra114. Since none of the publicly available development
boards prior to Dalmore (Tegra114) have made use of DSI, only Tegra114 and
later are supported by the drm/tegra driver.
eDP/DP
~~~~~~
eDP was first introduced in Tegra124 where it was used to drive the display
panel for notebook form factors. Tegra210 added support for full DisplayPort,
though this is currently not implemented in the drm/tegra driver.
Userspace Interface
===================
The userspace interface provided by drm/tegra allows applications to create
GEM buffers, access and control syncpoints as well as submit command streams
to host1x.
GEM Buffers
-----------
The ``DRM_IOCTL_TEGRA_GEM_CREATE`` IOCTL is used to create a GEM buffer object
with Tegra-specific flags. This is useful for buffers that should be tiled, or
that are to be scanned out upside down (useful for 3D content).
After a GEM buffer object has been created, its memory can be mapped by an
application using the mmap offset returned by the ``DRM_IOCTL_TEGRA_GEM_MMAP``
IOCTL.
Syncpoints
----------
The current value of a syncpoint can be obtained by executing the
``DRM_IOCTL_TEGRA_SYNCPT_READ`` IOCTL. Incrementing the syncpoint is achieved
using the ``DRM_IOCTL_TEGRA_SYNCPT_INCR`` IOCTL.
Userspace can also request blocking on a syncpoint. To do so, it needs to
execute the ``DRM_IOCTL_TEGRA_SYNCPT_WAIT`` IOCTL, specifying the value of
the syncpoint to wait for. The kernel will release the application when the
syncpoint reaches that value or after a specified timeout.
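
As a rough userspace illustration of the read and wait IOCTLs described above,
the sketch below wraps each of them in a small helper. It assumes the structure
layouts and field names (``id``, ``value``, ``thresh``, ``timeout``) of the
kernel UAPI header ``include/uapi/drm/tegra_drm.h``, installed as
``<drm/tegra_drm.h>``. The helper names are hypothetical, and treating the
timeout as milliseconds is an assumption; check both against the header and
driver shipped with your kernel.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

#include <drm/tegra_drm.h>

/*
 * Hypothetical helper: read the current value of syncpoint `id`.
 * `fd` is an open DRM device node driven by drm/tegra.
 * Returns 0 on success or a negative errno value.
 */
static int example_syncpt_read(int fd, uint32_t id, uint32_t *value)
{
	struct drm_tegra_syncpt_read args;

	memset(&args, 0, sizeof(args));
	args.id = id;

	if (ioctl(fd, DRM_IOCTL_TEGRA_SYNCPT_READ, &args) < 0)
		return -errno;

	*value = args.value;
	return 0;
}

/*
 * Hypothetical helper: block until syncpoint `id` reaches `thresh` or the
 * timeout expires (assumed to be in milliseconds). On success, *value holds
 * the syncpoint value reported by the kernel.
 * Returns 0 on success or a negative errno value.
 */
static int example_syncpt_wait(int fd, uint32_t id, uint32_t thresh,
			       uint32_t timeout_ms, uint32_t *value)
{
	struct drm_tegra_syncpt_wait args;

	memset(&args, 0, sizeof(args));
	args.id = id;
	args.thresh = thresh;
	args.timeout = timeout_ms;

	if (ioctl(fd, DRM_IOCTL_TEGRA_SYNCPT_WAIT, &args) < 0)
		return -errno;

	*value = args.value;
	return 0;
}
```

Both helpers return the negated ``errno`` on failure, so a caller can tell an
invalid file descriptor apart from a wait that timed out.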
Command Stream Submission
-------------------------
Before an application can submit command streams to host1x it needs to open a
channel to an engine using the ``DRM_IOCTL_TEGRA_OPEN_CHANNEL`` IOCTL. Client
IDs are used to identify the target of the channel. When a channel is no
longer needed, it can be closed using the ``DRM_IOCTL_TEGRA_CLOSE_CHANNEL``
IOCTL. To retrieve the syncpoint associated with a channel, an application
can use the ``DRM_IOCTL_TEGRA_GET_SYNCPT`` IOCTL.
After opening a channel, submitting command streams is easy. The application
writes commands into the memory backing a GEM buffer object and passes these
to the ``DRM_IOCTL_TEGRA_SUBMIT`` IOCTL along with various other parameters,
such as the syncpoints or relocations used in the job submission.
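
The channel lifecycle described above (open, query the syncpoint, close) could
be sketched from userspace as follows. This is a minimal illustration, not the
driver's reference usage: the helper names are hypothetical, and the structure
fields (``client``, ``context``, ``index``, ``id``) are taken from the UAPI
header ``<drm/tegra_drm.h>`` and should be verified against your kernel version.
Job submission itself is left out because the submit structure carries many
more parameters.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

#include <drm/tegra_drm.h>

/*
 * Hypothetical helper: close the channel identified by `context`.
 * Returns 0 on success or a negative errno value.
 */
static int example_close_channel(int fd, uint64_t context)
{
	struct drm_tegra_close_channel args;

	memset(&args, 0, sizeof(args));
	args.context = context;

	if (ioctl(fd, DRM_IOCTL_TEGRA_CLOSE_CHANNEL, &args) < 0)
		return -errno;

	return 0;
}

/*
 * Hypothetical helper: open a channel to the engine identified by `client`
 * and look up the ID of its syncpoint at position `index`.
 * On success, *context identifies the channel in later IOCTLs.
 * Returns 0 on success or a negative errno value.
 */
static int example_open_channel(int fd, uint32_t client, uint32_t index,
				uint64_t *context, uint32_t *syncpt_id)
{
	struct drm_tegra_open_channel open_args;
	struct drm_tegra_get_syncpt syncpt_args;
	int err;

	memset(&open_args, 0, sizeof(open_args));
	open_args.client = client;

	if (ioctl(fd, DRM_IOCTL_TEGRA_OPEN_CHANNEL, &open_args) < 0)
		return -errno;

	memset(&syncpt_args, 0, sizeof(syncpt_args));
	syncpt_args.context = open_args.context;
	syncpt_args.index = index;

	if (ioctl(fd, DRM_IOCTL_TEGRA_GET_SYNCPT, &syncpt_args) < 0) {
		err = -errno;
		/* Do not leak the channel if the syncpoint lookup fails. */
		example_close_channel(fd, open_args.context);
		return err;
	}

	*context = open_args.context;
	*syncpt_id = syncpt_args.id;
	return 0;
}
```

Closing the channel on the error path keeps the open and close calls paired
even when the syncpoint lookup fails.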
......@@ -177,19 +177,6 @@ following drivers still use ``struct_mutex``: ``msm``, ``omapdrm`` and
Contact: Daniel Vetter, respective driver maintainers
Switch to drm_connector_list_iter for any connector_list walking
----------------------------------------------------------------
Connectors can be hotplugged, and we now have a special list of helpers to walk
the connector_list in a race-free fashion, without incurring deadlocks on
mutexes and other fun stuff.
Unfortunately most drivers are not converted yet. At least all those supporting
DP MST hotplug should be converted, since for those drivers the difference
matters. See drm_for_each_connector_iter() vs. drm_for_each_connector().
Contact: Daniel Vetter
Core refactorings
=================
......
Sync File API Guide
~~~~~~~~~~~~~~~~~~~
===================
Sync File API Guide
===================
Gustavo Padovan
<gustavo at padovan dot org>
:Author: Gustavo Padovan <gustavo at padovan dot org>
This document serves as a guide for device drivers writers on what the
sync_file API is, and how drivers can support it. Sync file is the carrier of
......@@ -46,16 +46,17 @@ Creating Sync Files
When a driver needs to send an out-fence to userspace it creates a sync_file.
Interface:
Interface::
struct sync_file *sync_file_create(struct dma_fence *fence);
The caller passes the out-fence and gets back the sync_file. That is just the
first step, next it needs to install an fd on sync_file->file. So it gets an
fd:
fd::
fd = get_unused_fd_flags(O_CLOEXEC);
and installs it on sync_file->file:
and installs it on sync_file->file::
fd_install(fd, sync_file->file);
......@@ -71,7 +72,8 @@ When userspace needs to send an in-fence to the driver it passes file descriptor
of the Sync File to the kernel. The kernel can then retrieve the fences
from it.
Interface:
Interface::
struct dma_fence *sync_file_get_fence(int fd);
......@@ -79,5 +81,6 @@ The returned reference is owned by the caller and must be disposed of
afterwards using dma_fence_put(). In case of error, a NULL is returned instead.
References:
[1] struct sync_file in include/linux/sync_file.h
[2] All interfaces mentioned above defined in include/linux/sync_file.h
1. struct sync_file in include/linux/sync_file.h
2. All interfaces mentioned above defined in include/linux/sync_file.h
......@@ -4235,6 +4235,12 @@ F: include/drm/drm*
F: include/uapi/drm/drm*
F: include/linux/vga*
DRM DRIVER FOR ARM PL111 CLCD
M: Eric Anholt <eric@anholt.net>
T: git git://anongit.freedesktop.org/drm/drm-misc
S: Supported
F: drivers/gpu/drm/pl111/
DRM DRIVER FOR AST SERVER GRAPHICS CHIPS
M: Dave Airlie <airlied@redhat.com>
S: Odd Fixes
......@@ -4242,6 +4248,8 @@ F: drivers/gpu/drm/ast/
DRM DRIVERS FOR BRIDGE CHIPS
M: Archit Taneja <architt@codeaurora.org>
M: Andrzej Hajda <a.hajda@samsung.com>
R: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>
S: Maintained
T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/bridge/
......@@ -4498,6 +4506,17 @@ S: Maintained
F: drivers/gpu/drm/sti
F: Documentation/devicetree/bindings/display/st,stih4xx.txt
DRM DRIVERS FOR STM
M: Yannick Fertre <yannick.fertre@st.com>
M: Philippe Cornu <philippe.cornu@st.com>
M: Benjamin Gaignard <benjamin.gaignard@linaro.org>
M: Vincent Abriou <vincent.abriou@st.com>
L: dri-devel@lists.freedesktop.org
T: git git://anongit.freedesktop.org/drm/drm-misc
S: Maintained
F: drivers/gpu/drm/stm
F: Documentation/devicetree/bindings/display/st,stm32-ltdc.txt
DRM DRIVER FOR TDFX VIDEO CARDS
S: Orphan / Obsolete
F: drivers/gpu/drm/tdfx/
......
......@@ -558,8 +558,8 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
if (WARN_ON(!dmabuf || !dev))
return ERR_PTR(-EINVAL);
attach = kzalloc(sizeof(struct dma_buf_attachment), GFP_KERNEL);
if (attach == NULL)
attach = kzalloc(sizeof(*attach), GFP_KERNEL);
if (!attach)
return ERR_PTR(-ENOMEM);
attach->dev = dev;
......@@ -1122,9 +1122,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
attach_count = 0;
list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
seq_puts(s, "\t");
seq_printf(s, "%s\n", dev_name(attach_obj->dev));
seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
attach_count++;
}
......
......@@ -402,6 +402,11 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
}
}
if (!timeout) {
ret = 0;
goto out;
}
cb.base.func = dma_fence_default_wait_cb;
cb.task = current;
list_add(&cb.base.node, &fence->cb_list);
......
......@@ -110,7 +110,7 @@ static void sync_print_fence(struct seq_file *s,
}
}
seq_puts(s, "\n");
seq_putc(s, '\n');
}
static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj)
......@@ -132,9 +132,11 @@ static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj)
static void sync_print_sync_file(struct seq_file *s,
struct sync_file *sync_file)
{
char buf[128];
int i;
seq_printf(s, "[%p] %s: %s\n", sync_file, sync_file->name,
seq_printf(s, "[%p] %s: %s\n", sync_file,
sync_file_get_name(sync_file, buf, sizeof(buf)),
sync_status_str(dma_fence_get_status(sync_file->fence)));
if (dma_fence_is_array(sync_file->fence)) {
......@@ -161,7 +163,7 @@ static int sync_debugfs_show(struct seq_file *s, void *unused)
sync_timeline_list);
sync_print_obj(s, obj);
seq_puts(s, "\n");
seq_putc(s, '\n');
}
spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
......@@ -173,7 +175,7 @@ static int sync_debugfs_show(struct seq_file *s, void *unused)
container_of(pos, struct sync_file, sync_file_list);
sync_print_sync_file(s, sync_file);
seq_puts(s, "\n");
seq_putc(s, '\n');
}
spin_unlock_irqrestore(&sync_file_list_lock, flags);
return 0;
......
......@@ -41,8 +41,6 @@ static struct sync_file *sync_file_alloc(void)
if (IS_ERR(sync_file->file))
goto err;
kref_init(&sync_file->kref);
init_waitqueue_head(&sync_file->wq);
INIT_LIST_HEAD(&sync_file->cb.node);
......@@ -82,11 +80,6 @@ struct sync_file *sync_file_create(struct dma_fence *fence)
sync_file->fence = dma_fence_get(fence);
snprintf(sync_file->name, sizeof(sync_file->name), "%s-%s%llu-%d",
fence->ops->get_driver_name(fence),
fence->ops->get_timeline_name(fence), fence->context,
fence->seqno);
return sync_file;
}
EXPORT_SYMBOL(sync_file_create);
......@@ -131,6 +124,36 @@ struct dma_fence *sync_file_get_fence(int fd)
}
EXPORT_SYMBOL(sync_file_get_fence);
/**
* sync_file_get_name - get the name of the sync_file
* @sync_file: sync_file to get the name of
* @buf: destination buffer to copy sync_file name into
* @len: available size of destination buffer.
*
* Each sync_file may have a name assigned either by the user (when merging
* sync_files together) or created from the fence it contains. In the latter
* case construction of the name is deferred until use, and so requires
* sync_file_get_name().
*
* Returns: a string representing the name.
*/
char *sync_file_get_name(struct sync_file *sync_file, char *buf, int len)
{
if (sync_file->user_name[0]) {
strlcpy(buf, sync_file->user_name, len);
} else {
struct dma_fence *fence = sync_file->fence;
snprintf(buf, len, "%s-%s%llu-%d",
fence->ops->get_driver_name(fence),
fence->ops->get_timeline_name(fence),
fence->context,
fence->seqno);
}
return buf;
}
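
The deferred naming scheme introduced by sync_file_get_name() can be tried out
in userspace with a small model. The sketch below is an illustration only: the
`example_*` structures are hypothetical, simplified stand-ins for `struct
sync_file` and `struct dma_fence`, and `snprintf()` replaces the kernel's
`strlcpy()`, which is not available in every C library.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct dma_fence. */
struct example_fence {
	const char *driver_name;
	const char *timeline_name;
	unsigned long long context;
	int seqno;
};

/* Hypothetical, simplified stand-in for struct sync_file. */
struct example_sync_file {
	char user_name[32];	/* stays empty unless set by a merge */
	struct example_fence *fence;
};

/*
 * Model of the name selection: prefer the user-supplied name, otherwise
 * build "<driver>-<timeline><context>-<seqno>" on demand into `buf`.
 */
static char *example_get_name(const struct example_sync_file *sf,
			      char *buf, int len)
{
	if (sf->user_name[0]) {
		/* Bounded, NUL-terminated copy, like strlcpy() in the kernel. */
		snprintf(buf, len, "%s", sf->user_name);
	} else {
		const struct example_fence *fence = sf->fence;

		snprintf(buf, len, "%s-%s%llu-%d",
			 fence->driver_name, fence->timeline_name,
			 fence->context, fence->seqno);
	}

	return buf;
}
```

Because the name is only built when requested, creating a sync_file no longer
pays for an snprintf() that most users never read.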
static int sync_file_set_fence(struct sync_file *sync_file,
struct dma_fence **fences, int num_fences)
{
......@@ -268,7 +291,7 @@ static struct sync_file *sync_file_merge(const char *name, struct sync_file *a,
goto err;
}
strlcpy(sync_file->name, name, sizeof(sync_file->name));
strlcpy(sync_file->user_name, name, sizeof(sync_file->user_name));
return sync_file;
err:
......@@ -277,22 +300,15 @@ static struct sync_file *sync_file_merge(const char *name, struct sync_file *a,
}
static void sync_file_free(struct kref *kref)
static int sync_file_release(struct inode *inode, struct file *file)
{
struct sync_file *sync_file = container_of(kref, struct sync_file,
kref);
struct sync_file *sync_file = file->private_data;
if (test_bit(POLL_ENABLED, &sync_file->fence->flags))
dma_fence_remove_callback(sync_file->fence, &sync_file->cb);
dma_fence_put(sync_file->fence);
kfree(sync_file);
}
static int sync_file_release(struct inode *inode, struct file *file)
{
struct sync_file *sync_file = file->private_data;
kref_put(&sync_file->kref, sync_file_free);
return 0;
}
......@@ -422,7 +438,7 @@ static long sync_file_ioctl_fence_info(struct sync_file *sync_file,
}
no_fences:
strlcpy(info.name, sync_file->name, sizeof(info.name));
sync_file_get_name(sync_file, info.name, sizeof(info.name));
info.status = dma_fence_is_signaled(sync_file->fence);
info.num_fences = num_fences;
......
......@@ -246,6 +246,8 @@ source "drivers/gpu/drm/fsl-dcu/Kconfig"
source "drivers/gpu/drm/tegra/Kconfig"
source "drivers/gpu/drm/stm/Kconfig"
source "drivers/gpu/drm/panel/Kconfig"
source "drivers/gpu/drm/bridge/Kconfig"
......@@ -274,6 +276,8 @@ source "drivers/gpu/drm/meson/Kconfig"
source "drivers/gpu/drm/tinydrm/Kconfig"
source "drivers/gpu/drm/pl111/Kconfig"
# Keep legacy drivers last
menuconfig DRM_LEGACY
......
......@@ -16,7 +16,8 @@ drm-y := drm_auth.o drm_bufs.o drm_cache.o \
drm_framebuffer.o drm_connector.o drm_blend.o \
drm_encoder.o drm_mode_object.o drm_property.o \
drm_plane.o drm_color_mgmt.o drm_print.o \
drm_dumb_buffers.o drm_mode_config.o
drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
drm_syncobj.o
drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o
drm-$(CONFIG_DRM_VM) += drm_vm.o
......@@ -34,6 +35,7 @@ drm_kms_helper-y := drm_crtc_helper.o drm_dp_helper.o drm_probe_helper.o \
drm_simple_kms_helper.o drm_modeset_helper.o \
drm_scdc_helper.o
drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
drm_kms_helper-$(CONFIG_DRM_LOAD_EDID_FIRMWARE) += drm_edid_load.o
drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o
......@@ -82,6 +84,7 @@ obj-$(CONFIG_DRM_BOCHS) += bochs/
obj-$(CONFIG_DRM_VIRTIO_GPU) += virtio/
obj-$(CONFIG_DRM_MSM) += msm/
obj-$(CONFIG_DRM_TEGRA) += tegra/
obj-$(CONFIG_DRM_STM) += stm/
obj-$(CONFIG_DRM_STI) += sti/
obj-$(CONFIG_DRM_IMX) += imx/
obj-$(CONFIG_DRM_MEDIATEK) += mediatek/
......@@ -96,3 +99,4 @@ obj-y += hisilicon/
obj-$(CONFIG_DRM_ZTE) += zte/
obj-$(CONFIG_DRM_MXSFB) += mxsfb/
obj-$(CONFIG_DRM_TINYDRM) += tinydrm/
obj-$(CONFIG_DRM_PL111) += pl111/
......@@ -5,15 +5,23 @@ config DRM_AMDGPU_SI
Choose this option if you want to enable experimental support
for SI asics.
SI is already supported in radeon. Experimental support for SI
in amdgpu will be disabled by default and is still provided by
radeon. Use module options to override this:
radeon.si_support=0 amdgpu.si_support=1
config DRM_AMDGPU_CIK
bool "Enable amdgpu support for CIK parts"
depends on DRM_AMDGPU
help
Choose this option if you want to enable experimental support
for CIK asics.
Choose this option if you want to enable support for CIK asics.
CIK is already supported in radeon. Support for CIK in amdgpu
will be disabled by default and is still provided by radeon.
Use module options to override this:
CIK is already supported in radeon. CIK support in amdgpu
is for experimentation and testing.
radeon.cik_support=0 amdgpu.cik_support=1
config DRM_AMDGPU_USERPTR
bool "Always enable userptr write support"
......
......@@ -4,7 +4,7 @@
FULL_AMD_PATH=$(src)/..
ccflags-y := -Iinclude/drm -I$(FULL_AMD_PATH)/include/asic_reg \
ccflags-y := -I$(FULL_AMD_PATH)/include/asic_reg \
-I$(FULL_AMD_PATH)/include \
-I$(FULL_AMD_PATH)/amdgpu \
-I$(FULL_AMD_PATH)/scheduler \
......@@ -24,7 +24,8 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o
amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
amdgpu_queue_mgr.o
# add asic specific block
amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
......@@ -34,7 +35,7 @@ amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
amdgpu-$(CONFIG_DRM_AMDGPU_SI)+= si.o gmc_v6_0.o gfx_v6_0.o si_ih.o si_dma.o dce_v6_0.o si_dpm.o si_smc.o
amdgpu-y += \
vi.o mxgpu_vi.o nbio_v6_1.o soc15.o mxgpu_ai.o
vi.o mxgpu_vi.o nbio_v6_1.o soc15.o mxgpu_ai.o nbio_v7_0.o
# add GMC block
amdgpu-y += \
......@@ -54,7 +55,8 @@ amdgpu-y += \
# add PSP block
amdgpu-y += \
amdgpu_psp.o \
psp_v3_1.o
psp_v3_1.o \
psp_v10_0.o
# add SMC block
amdgpu-y += \
......@@ -92,6 +94,11 @@ amdgpu-y += \
vce_v3_0.o \
vce_v4_0.o
# add VCN block
amdgpu-y += \
amdgpu_vcn.o \
vcn_v1_0.o
# add amdkfd interfaces
amdgpu-y += \
amdgpu_amdkfd.o \
......
......@@ -36,16 +36,18 @@
#include <linux/hashtable.h>
#include <linux/dma-fence.h>
#include <ttm/ttm_bo_api.h>
#include <ttm/ttm_bo_driver.h>
#include <ttm/ttm_placement.h>
#include <ttm/ttm_module.h>
#include <ttm/ttm_execbuf_util.h>
#include <drm/ttm/ttm_bo_api.h>
#include <drm/ttm/ttm_bo_driver.h>
#include <drm/ttm/ttm_placement.h>
#include <drm/ttm/ttm_module.h>
#include <drm/ttm/ttm_execbuf_util.h>
#include <drm/drmP.h>
#include <drm/drm_gem.h>
#include <drm/amdgpu_drm.h>
#include <kgd_kfd_interface.h>
#include "amd_shared.h"
#include "amdgpu_mode.h"
#include "amdgpu_ih.h"
......@@ -62,6 +64,7 @@
#include "amdgpu_acp.h"
#include "amdgpu_uvd.h"
#include "amdgpu_vce.h"
#include "amdgpu_vcn.h"
#include "gpu_scheduler.h"
#include "amdgpu_virt.h"
......@@ -92,6 +95,7 @@ extern int amdgpu_vm_size;
extern int amdgpu_vm_block_size;
extern int amdgpu_vm_fault_stop;
extern int amdgpu_vm_debug;
extern int amdgpu_vm_update_mode;
extern int amdgpu_sched_jobs;
extern int amdgpu_sched_hw_submission;
extern int amdgpu_no_evict;
......@@ -109,6 +113,15 @@ extern int amdgpu_prim_buf_per_se;
extern int amdgpu_pos_buf_per_se;
extern int amdgpu_cntl_sb_buf_per_se;
extern int amdgpu_param_buf_per_se;
extern int amdgpu_job_hang_limit;
extern int amdgpu_lbpw;
#ifdef CONFIG_DRM_AMDGPU_SI
extern int amdgpu_si_support;
#endif
#ifdef CONFIG_DRM_AMDGPU_CIK
extern int amdgpu_cik_support;
#endif
#define AMDGPU_DEFAULT_GTT_SIZE_MB 3072ULL /* 3GB by default */
#define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS 3000
......@@ -305,8 +318,8 @@ struct amdgpu_gart_funcs {
/* set pte flags based per asic */
uint64_t (*get_vm_pte_flags)(struct amdgpu_device *adev,
uint32_t flags);
/* adjust mc addr in fb for APU case */
u64 (*adjust_mc_addr)(struct amdgpu_device *adev, u64 addr);
/* get the pde for a given mc addr */
u64 (*get_vm_pde)(struct amdgpu_device *adev, u64 addr);
uint32_t (*get_invalidate_req)(unsigned int vm_id);
};
......@@ -554,7 +567,7 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
int amdgpu_gart_init(struct amdgpu_device *adev);
void amdgpu_gart_fini(struct amdgpu_device *adev);
void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
int pages);
int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset,
int pages, struct page **pagelist,
......@@ -602,6 +615,7 @@ struct amdgpu_mc {
uint32_t srbm_soft_reset;
struct amdgpu_mode_mc_save save;
bool prt_warning;
uint64_t stolen_size;
/* apertures */
u64 shared_aperture_start;
u64 shared_aperture_end;
......@@ -771,6 +785,29 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring,
struct amd_sched_entity *entity, void *owner,
struct dma_fence **f);
/*
* Queue manager
*/
struct amdgpu_queue_mapper {
int hw_ip;
struct mutex lock;
/* protected by lock */
struct amdgpu_ring *queue_map[AMDGPU_MAX_RINGS];
};
struct amdgpu_queue_mgr {
struct amdgpu_queue_mapper mapper[AMDGPU_MAX_IP_NUM];
};
int amdgpu_queue_mgr_init(struct amdgpu_device *adev,
struct amdgpu_queue_mgr *mgr);
int amdgpu_queue_mgr_fini(struct amdgpu_device *adev,
struct amdgpu_queue_mgr *mgr);
int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
struct amdgpu_queue_mgr *mgr,
int hw_ip, int instance, int ring,
struct amdgpu_ring **out_ring);
/*
* context related structures
*/
......@@ -784,6 +821,7 @@ struct amdgpu_ctx_ring {
struct amdgpu_ctx {
struct kref refcount;
struct amdgpu_device *adev;
struct amdgpu_queue_mgr queue_mgr;
unsigned reset_counter;
spinlock_t ring_lock;
struct dma_fence **fences;
......@@ -822,6 +860,7 @@ struct amdgpu_fpriv {
struct mutex bo_list_lock;
struct idr bo_list_handles;
struct amdgpu_ctx_mgr ctx_mgr;
u32 vram_lost_counter;
};
/*
......@@ -830,6 +869,8 @@ struct amdgpu_fpriv {
struct amdgpu_bo_list {
struct mutex lock;
struct rcu_head rhead;
struct kref refcount;
struct amdgpu_bo *gds_obj;
struct amdgpu_bo *gws_obj;
struct amdgpu_bo *oa_obj;
......@@ -893,20 +934,26 @@ struct amdgpu_rlc {
u32 *register_restore;
};
#define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES
struct amdgpu_mec {
struct amdgpu_bo *hpd_eop_obj;
u64 hpd_eop_gpu_addr;
struct amdgpu_bo *mec_fw_obj;
u64 mec_fw_gpu_addr;
u32 num_pipe;
u32 num_mec;
u32 num_queue;
u32 num_pipe_per_mec;
u32 num_queue_per_pipe;
void *mqd_backup[AMDGPU_MAX_COMPUTE_RINGS + 1];
/* These are the resources for which amdgpu takes ownership */
DECLARE_BITMAP(queue_bitmap, AMDGPU_MAX_COMPUTE_QUEUES);
};
struct amdgpu_kiq {
u64 eop_gpu_addr;
struct amdgpu_bo *eop_obj;
struct mutex ring_mutex;
struct amdgpu_ring ring;
struct amdgpu_irq_src irq;
};
......@@ -983,7 +1030,10 @@ struct amdgpu_gfx_config {
struct amdgpu_cu_info {
uint32_t number; /* total active CU number */
uint32_t ao_cu_mask;
uint32_t max_waves_per_simd;
uint32_t wave_front_size;
uint32_t max_scratch_slots_per_cu;
uint32_t lds_size;
uint32_t bitmap[4][4];
};
......@@ -1061,6 +1111,8 @@ struct amdgpu_gfx {
uint32_t grbm_soft_reset;
uint32_t srbm_soft_reset;
bool in_reset;
/* s3/s4 mask */
bool in_suspend;
/* NGG */
struct amdgpu_ngg ngg;
};
......@@ -1109,12 +1161,14 @@ struct amdgpu_cs_parser {
/* user fence */
struct amdgpu_bo_list_entry uf_entry;
unsigned num_post_dep_syncobjs;
struct drm_syncobj **post_dep_syncobjs;
};
#define AMDGPU_PREAMBLE_IB_PRESENT (1 << 0) /* bit set means command submit involves a preamble IB */
#define AMDGPU_PREAMBLE_IB_PRESENT_FIRST (1 << 1) /* bit set means preamble IB is first presented in belonging context */
#define AMDGPU_HAVE_CTX_SWITCH (1 << 2) /* bit set means context switch occurred */
#define AMDGPU_VM_DOMAIN (1 << 3) /* bit set means in virtual memory context */
struct amdgpu_job {
struct amd_sched_job base;
......@@ -1122,6 +1176,8 @@ struct amdgpu_job {
struct amdgpu_vm *vm;
struct amdgpu_ring *ring;
struct amdgpu_sync sync;
struct amdgpu_sync dep_sync;
struct amdgpu_sync sched_sync;
struct amdgpu_ib *ibs;
struct dma_fence *fence; /* the hw fence */
uint32_t preamble_status;
......@@ -1129,7 +1185,6 @@ struct amdgpu_job {
void *owner;
uint64_t fence_ctx; /* the fence_context this job uses */
bool vm_needs_flush;
bool need_pipeline_sync;
unsigned vm_id;
uint64_t vm_pd_addr;
uint32_t gds_base, gds_size;
......@@ -1221,6 +1276,9 @@ struct amdgpu_firmware {
const struct amdgpu_psp_funcs *funcs;
struct amdgpu_bo *rbuf;
struct mutex mutex;
/* gpu info firmware data pointer */
const struct firmware *gpu_info_fw;
};
/*
......@@ -1296,7 +1354,6 @@ struct amdgpu_smumgr {
*/
struct amdgpu_allowed_register_entry {
uint32_t reg_offset;
bool untouched;
bool grbm_indexed;
};
......@@ -1424,6 +1481,7 @@ typedef void (*amdgpu_wreg_t)(struct amdgpu_device*, uint32_t, uint32_t);
typedef uint32_t (*amdgpu_block_rreg_t)(struct amdgpu_device*, uint32_t, uint32_t);
typedef void (*amdgpu_block_wreg_t)(struct amdgpu_device*, uint32_t, uint32_t, uint32_t);
#define AMDGPU_RESET_MAGIC_NUM 64
struct amdgpu_device {
struct device *dev;
struct drm_device *ddev;
......@@ -1523,7 +1581,9 @@ struct amdgpu_device {
atomic64_t gtt_usage;
atomic64_t num_bytes_moved;
atomic64_t num_evictions;
atomic64_t num_vram_cpu_page_faults;
atomic_t gpu_reset_counter;
atomic_t vram_lost_counter;
/* data for buffer migration throttling */
struct {
......@@ -1570,11 +1630,18 @@ struct amdgpu_device {
/* sdma */
struct amdgpu_sdma sdma;
/* uvd */
struct amdgpu_uvd uvd;
union {
struct {
/* uvd */
struct amdgpu_uvd uvd;
/* vce */
struct amdgpu_vce vce;
};
/* vce */
struct amdgpu_vce vce;
/* vcn */
struct amdgpu_vcn vcn;
};
/* firmwares */
struct amdgpu_firmware firmware;
......@@ -1598,6 +1665,9 @@ struct amdgpu_device {
/* amdkfd interface */
struct kfd_dev *kfd;
/* delayed work_func for deferring clockgating during resume */
struct delayed_work late_init_work;
struct amdgpu_virt virt;
/* link all shadow bo */
......@@ -1606,9 +1676,13 @@ struct amdgpu_device {
/* link all gtt */
spinlock_t gtt_list_lock;
struct list_head gtt_list;
/* keep an lru list of rings by HW IP */
struct list_head ring_lru_list;
spinlock_t ring_lru_list_lock;
/* record hw reset is performed */
bool has_hw_reset;
u8 reset_magic[AMDGPU_RESET_MAGIC_NUM];
};
......@@ -1617,7 +1691,6 @@ static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
return container_of(bdev, struct amdgpu_device, mman.bdev);
}
bool amdgpu_device_is_px(struct drm_device *dev);
int amdgpu_device_init(struct amdgpu_device *adev,
struct drm_device *ddev,
struct pci_dev *pdev,
......@@ -1733,30 +1806,31 @@ static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, void *sr
unsigned occupied, chunk1, chunk2;
void *dst;
if (ring->count_dw < count_dw) {
if (unlikely(ring->count_dw < count_dw)) {
DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
} else {
occupied = ring->wptr & ring->buf_mask;
dst = (void *)&ring->ring[occupied];
chunk1 = ring->buf_mask + 1 - occupied;
chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
chunk2 = count_dw - chunk1;
chunk1 <<= 2;
chunk2 <<= 2;
if (chunk1)
memcpy(dst, src, chunk1);
if (chunk2) {
src += chunk1;
dst = (void *)ring->ring;
memcpy(dst, src, chunk2);
}
ring->wptr += count_dw;
ring->wptr &= ring->ptr_mask;
ring->count_dw -= count_dw;
return;
}
occupied = ring->wptr & ring->buf_mask;
dst = (void *)&ring->ring[occupied];
chunk1 = ring->buf_mask + 1 - occupied;
chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
chunk2 = count_dw - chunk1;
chunk1 <<= 2;
chunk2 <<= 2;
if (chunk1)
memcpy(dst, src, chunk1);
if (chunk2) {
src += chunk1;
dst = (void *)ring->ring;
memcpy(dst, src, chunk2);
}
ring->wptr += count_dw;
ring->wptr &= ring->ptr_mask;
ring->count_dw -= count_dw;
}
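The copy in `amdgpu_ring_write_multiple` splits the source at the end of the ring buffer and continues from index 0. A user-space sketch of that wrap-around copy, assuming a power-of-two ring, is given here; `demo_ring` and `demo_ring_write_multiple` are hypothetical names, and the error return replaces the `DRM_ERROR` call for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the few ring fields the copy touches. */
struct demo_ring {
	uint32_t *ring;      /* backing storage, buf_mask + 1 dwords */
	uint32_t buf_mask;   /* number of dwords minus one (power of two) */
	uint64_t ptr_mask;   /* mask applied to the write pointer */
	uint64_t wptr;       /* write pointer, in dwords */
	unsigned count_dw;   /* dwords still reserved for the writer */
};

/* Copy count_dw dwords into the ring, splitting at the wrap point.
 * Returns 0 on success, -1 if the caller reserved too little space. */
static int demo_ring_write_multiple(struct demo_ring *ring, const void *src,
				    unsigned count_dw)
{
	unsigned occupied, chunk1, chunk2;
	const uint8_t *from = src;

	assert(ring && src);
	if (ring->count_dw < count_dw)
		return -1;

	occupied = ring->wptr & ring->buf_mask;
	chunk1 = ring->buf_mask + 1 - occupied;      /* dwords up to the end */
	chunk1 = (chunk1 >= count_dw) ? count_dw : chunk1;
	chunk2 = count_dw - chunk1;                  /* dwords after the wrap */

	if (chunk1)
		memcpy(&ring->ring[occupied], from, chunk1 * 4u);
	if (chunk2)
		memcpy(ring->ring, from + chunk1 * 4u, chunk2 * 4u);

	ring->wptr = (ring->wptr + count_dw) & ring->ptr_mask;
	ring->count_dw -= count_dw;
	return 0;
}
```

Returning early on the error path, as the reworked hunk does, keeps the common path unindented and lets the compiler treat the overflow check as the unlikely branch.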
static inline struct amdgpu_sdma_instance *
......@@ -1792,6 +1866,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
#define amdgpu_asic_get_config_memsize(adev) (adev)->asic_funcs->get_config_memsize((adev))
#define amdgpu_gart_flush_gpu_tlb(adev, vmid) (adev)->gart.gart_funcs->flush_gpu_tlb((adev), (vmid))
#define amdgpu_gart_set_pte_pde(adev, pt, idx, addr, flags) (adev)->gart.gart_funcs->set_pte_pde((adev), (pt), (idx), (addr), (flags))
#define amdgpu_gart_get_vm_pde(adev, addr) (adev)->gart.gart_funcs->get_vm_pde((adev), (addr))
#define amdgpu_vm_copy_pte(adev, ib, pe, src, count) ((adev)->vm_manager.vm_pte_funcs->copy_pte((ib), (pe), (src), (count)))
#define amdgpu_vm_write_pte(adev, ib, pe, value, count, incr) ((adev)->vm_manager.vm_pte_funcs->write_pte((ib), (pe), (value), (count), (incr)))
#define amdgpu_vm_set_pte_pde(adev, ib, pe, addr, count, incr, flags) ((adev)->vm_manager.vm_pte_funcs->set_pte_pde((ib), (pe), (addr), (count), (incr), (flags)))
......@@ -1813,6 +1888,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
#define amdgpu_ring_emit_cntxcntl(r, d) (r)->funcs->emit_cntxcntl((r), (d))
#define amdgpu_ring_emit_rreg(r, d) (r)->funcs->emit_rreg((r), (d))
#define amdgpu_ring_emit_wreg(r, d, v) (r)->funcs->emit_wreg((r), (d), (v))
#define amdgpu_ring_emit_tmz(r, b) (r)->funcs->emit_tmz((r), (b))
#define amdgpu_ring_pad_ib(r, ib) ((r)->funcs->pad_ib((r), (ib)))
#define amdgpu_ring_init_cond_exec(r) (r)->funcs->init_cond_exec((r))
#define amdgpu_ring_patch_cond_exec(r,o) (r)->funcs->patch_cond_exec((r),(o))
......@@ -1849,9 +1925,6 @@ bool amdgpu_need_post(struct amdgpu_device *adev);
void amdgpu_update_display_priority(struct amdgpu_device *adev);
int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data);
int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 ip_type,
u32 ip_instance, u32 ring,
struct amdgpu_ring **out_ring);
void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes);
void amdgpu_ttm_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
bool amdgpu_ttm_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
......@@ -1900,6 +1973,8 @@ static inline bool amdgpu_has_atpx(void) { return false; }
extern const struct drm_ioctl_desc amdgpu_ioctls_kms[];
extern const int amdgpu_max_kms_ioctl;
bool amdgpu_kms_vram_lost(struct amdgpu_device *adev,
struct amdgpu_fpriv *fpriv);
int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags);
void amdgpu_driver_unload_kms(struct drm_device *dev);
void amdgpu_driver_lastclose_kms(struct drm_device *dev);
......@@ -1912,10 +1987,6 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon);
u32 amdgpu_get_vblank_counter_kms(struct drm_device *dev, unsigned int pipe);
int amdgpu_enable_vblank_kms(struct drm_device *dev, unsigned int pipe);
void amdgpu_disable_vblank_kms(struct drm_device *dev, unsigned int pipe);
int amdgpu_get_vblank_timestamp_kms(struct drm_device *dev, unsigned int pipe,
int *max_error,
struct timeval *vblank_time,
unsigned flags);
long amdgpu_kms_compat_ioctl(struct file *filp, unsigned int cmd,
unsigned long arg);
......
......@@ -24,6 +24,7 @@
#include "amd_shared.h"
#include <drm/drmP.h>
#include "amdgpu.h"
#include "amdgpu_gfx.h"
#include <linux/module.h>
const struct kfd2kgd_calls *kfd2kgd;
......@@ -60,9 +61,9 @@ int amdgpu_amdkfd_init(void)
return ret;
}
bool amdgpu_amdkfd_load_interface(struct amdgpu_device *rdev)
bool amdgpu_amdkfd_load_interface(struct amdgpu_device *adev)
{
switch (rdev->asic_type) {
switch (adev->asic_type) {
#ifdef CONFIG_DRM_AMDGPU_CIK
case CHIP_KAVERI:
kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
......@@ -86,59 +87,83 @@ void amdgpu_amdkfd_fini(void)
}
}
void amdgpu_amdkfd_device_probe(struct amdgpu_device *rdev)
void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
{
if (kgd2kfd)
rdev->kfd = kgd2kfd->probe((struct kgd_dev *)rdev,
rdev->pdev, kfd2kgd);
adev->kfd = kgd2kfd->probe((struct kgd_dev *)adev,
adev->pdev, kfd2kgd);
}
void amdgpu_amdkfd_device_init(struct amdgpu_device *rdev)
void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
{
if (rdev->kfd) {
int i;
int last_valid_bit;
if (adev->kfd) {
struct kgd2kfd_shared_resources gpu_resources = {
.compute_vmid_bitmap = 0xFF00,
.first_compute_pipe = 1,
.compute_pipe_count = 4 - 1,
.num_mec = adev->gfx.mec.num_mec,
.num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec,
.num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe
};
amdgpu_doorbell_get_kfd_info(rdev,
/* this is going to have a few of the MSBs set that we need to
* clear */
bitmap_complement(gpu_resources.queue_bitmap,
adev->gfx.mec.queue_bitmap,
KGD_MAX_QUEUES);
/* remove the KIQ bit as well */
if (adev->gfx.kiq.ring.ready)
clear_bit(amdgpu_gfx_queue_to_bit(adev,
adev->gfx.kiq.ring.me - 1,
adev->gfx.kiq.ring.pipe,
adev->gfx.kiq.ring.queue),
gpu_resources.queue_bitmap);
/* According to linux/bitmap.h we shouldn't use bitmap_clear if
* nbits is not compile time constant */
last_valid_bit = adev->gfx.mec.num_mec
* adev->gfx.mec.num_pipe_per_mec
* adev->gfx.mec.num_queue_per_pipe;
for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i)
clear_bit(i, gpu_resources.queue_bitmap);
amdgpu_doorbell_get_kfd_info(adev,
&gpu_resources.doorbell_physical_address,
&gpu_resources.doorbell_aperture_size,
&gpu_resources.doorbell_start_offset);
kgd2kfd->device_init(rdev->kfd, &gpu_resources);
kgd2kfd->device_init(adev->kfd, &gpu_resources);
}
}
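The queue bitmap handed to KFD in this hunk is the complement of the queues amdgpu keeps, with every bit at or above `num_mec * num_pipe_per_mec * num_queue_per_pipe` cleared. The sketch below shows that arithmetic on a single 64-bit word; the 64-queue limit and the name `demo_kfd_queue_mask` are assumptions made for illustration, and the KIQ bit handling is left out.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_QUEUES 64u /* hypothetical; the kernel uses KGD_MAX_QUEUES */

/* Return the queues left for KFD: everything amdgpu does not own,
 * limited to queues that exist on the hardware. */
static uint64_t demo_kfd_queue_mask(uint64_t amdgpu_owned, unsigned num_mec,
				    unsigned num_pipe_per_mec,
				    unsigned num_queue_per_pipe)
{
	unsigned last_valid_bit = num_mec * num_pipe_per_mec * num_queue_per_pipe;
	uint64_t mask = ~amdgpu_owned;
	unsigned i;

	assert(last_valid_bit <= DEMO_MAX_QUEUES);
	/* Complementing also set the bits of queues that do not exist. */
	for (i = last_valid_bit; i < DEMO_MAX_QUEUES; ++i)
		mask &= ~(UINT64_C(1) << i);
	return mask;
}
```

With one MEC, four pipes per MEC and eight queues per pipe, 32 queues exist, so an owned mask of `0x3` would leave `0xFFFFFFFC` for KFD.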
void amdgpu_amdkfd_device_fini(struct amdgpu_device *rdev)
void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
{
if (rdev->kfd) {
kgd2kfd->device_exit(rdev->kfd);
rdev->kfd = NULL;
if (adev->kfd) {
kgd2kfd->device_exit(adev->kfd);
adev->kfd = NULL;
}
}
void amdgpu_amdkfd_interrupt(struct amdgpu_device *rdev,
void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
const void *ih_ring_entry)
{
if (rdev->kfd)
kgd2kfd->interrupt(rdev->kfd, ih_ring_entry);
if (adev->kfd)
kgd2kfd->interrupt(adev->kfd, ih_ring_entry);
}
void amdgpu_amdkfd_suspend(struct amdgpu_device *rdev)
void amdgpu_amdkfd_suspend(struct amdgpu_device *adev)
{
if (rdev->kfd)
kgd2kfd->suspend(rdev->kfd);
if (adev->kfd)
kgd2kfd->suspend(adev->kfd);
}
int amdgpu_amdkfd_resume(struct amdgpu_device *rdev)
int amdgpu_amdkfd_resume(struct amdgpu_device *adev)
{
int r = 0;
if (rdev->kfd)
r = kgd2kfd->resume(rdev->kfd);
if (adev->kfd)
r = kgd2kfd->resume(adev->kfd);
return r;
}
......@@ -147,7 +172,7 @@ int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
void **mem_obj, uint64_t *gpu_addr,
void **cpu_ptr)
{
struct amdgpu_device *rdev = (struct amdgpu_device *)kgd;
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
struct kgd_mem **mem = (struct kgd_mem **) mem_obj;
int r;
......@@ -159,10 +184,10 @@ int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
if ((*mem) == NULL)
return -ENOMEM;
r = amdgpu_bo_create(rdev, size, PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_GTT,
r = amdgpu_bo_create(adev, size, PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_GTT,
AMDGPU_GEM_CREATE_CPU_GTT_USWC, NULL, NULL, &(*mem)->bo);
if (r) {
dev_err(rdev->dev,
dev_err(adev->dev,
"failed to allocate BO for amdkfd (%d)\n", r);
return r;
}
......@@ -170,21 +195,21 @@ int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
/* map the buffer */
r = amdgpu_bo_reserve((*mem)->bo, true);
if (r) {
dev_err(rdev->dev, "(%d) failed to reserve bo for amdkfd\n", r);
dev_err(adev->dev, "(%d) failed to reserve bo for amdkfd\n", r);
goto allocate_mem_reserve_bo_failed;
}
r = amdgpu_bo_pin((*mem)->bo, AMDGPU_GEM_DOMAIN_GTT,
&(*mem)->gpu_addr);
if (r) {
dev_err(rdev->dev, "(%d) failed to pin bo for amdkfd\n", r);
dev_err(adev->dev, "(%d) failed to pin bo for amdkfd\n", r);
goto allocate_mem_pin_bo_failed;
}
*gpu_addr = (*mem)->gpu_addr;
r = amdgpu_bo_kmap((*mem)->bo, &(*mem)->cpu_ptr);
if (r) {
dev_err(rdev->dev,
dev_err(adev->dev,
"(%d) failed to map bo to kernel for amdkfd\n", r);
goto allocate_mem_kmap_bo_failed;
}
......@@ -220,27 +245,27 @@ void free_gtt_mem(struct kgd_dev *kgd, void *mem_obj)
uint64_t get_vmem_size(struct kgd_dev *kgd)
{
struct amdgpu_device *rdev =
struct amdgpu_device *adev =
(struct amdgpu_device *)kgd;
BUG_ON(kgd == NULL);
return rdev->mc.real_vram_size;
return adev->mc.real_vram_size;
}
uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
{
struct amdgpu_device *rdev = (struct amdgpu_device *)kgd;
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
if (rdev->gfx.funcs->get_gpu_clock_counter)
return rdev->gfx.funcs->get_gpu_clock_counter(rdev);
if (adev->gfx.funcs->get_gpu_clock_counter)
return adev->gfx.funcs->get_gpu_clock_counter(adev);
return 0;
}
uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
{
struct amdgpu_device *rdev = (struct amdgpu_device *)kgd;
struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
/* The sclk is in quanta of 10kHz */
return rdev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
return adev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
}
......@@ -39,15 +39,15 @@ struct kgd_mem {
int amdgpu_amdkfd_init(void);
void amdgpu_amdkfd_fini(void);
bool amdgpu_amdkfd_load_interface(struct amdgpu_device *rdev);
bool amdgpu_amdkfd_load_interface(struct amdgpu_device *adev);
void amdgpu_amdkfd_suspend(struct amdgpu_device *rdev);
int amdgpu_amdkfd_resume(struct amdgpu_device *rdev);
void amdgpu_amdkfd_interrupt(struct amdgpu_device *rdev,
void amdgpu_amdkfd_suspend(struct amdgpu_device *adev);
int amdgpu_amdkfd_resume(struct amdgpu_device *adev);
void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
const void *ih_ring_entry);
void amdgpu_amdkfd_device_probe(struct amdgpu_device *rdev);
void amdgpu_amdkfd_device_init(struct amdgpu_device *rdev);
void amdgpu_amdkfd_device_fini(struct amdgpu_device *rdev);
void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev);
void amdgpu_amdkfd_device_init(struct amdgpu_device *adev);
void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev);
struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void);
struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void);
......
......@@ -29,6 +29,7 @@
#include "cikd.h"
#include "cik_sdma.h"
#include "amdgpu_ucode.h"
#include "gfx_v7_0.h"
#include "gca/gfx_7_2_d.h"
#include "gca/gfx_7_2_enum.h"
#include "gca/gfx_7_2_sh_mask.h"
......@@ -38,8 +39,6 @@
#include "gmc/gmc_7_1_sh_mask.h"
#include "cik_structs.h"
#define CIK_PIPE_PER_MEC (4)
enum {
MAX_TRAPID = 8, /* 3 bits in the bitfield. */
MAX_WATCH_ADDRESSES = 4
......@@ -185,8 +184,10 @@ static void unlock_srbm(struct kgd_dev *kgd)
static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
uint32_t queue_id)
{
uint32_t mec = (++pipe_id / CIK_PIPE_PER_MEC) + 1;
uint32_t pipe = (pipe_id % CIK_PIPE_PER_MEC);
struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
lock_srbm(kgd, mec, pipe, queue_id, 0);
}
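`acquire_queue` now derives the MEC and pipe indices from the runtime value `num_pipe_per_mec` instead of a compile-time constant. A small sketch of the same arithmetic, including the pre-increment of `pipe_id` seen in the hunk, could look like this; `demo_queue_slot` and `demo_decode_pipe` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: 1-based MEC index and pipe within that MEC. */
struct demo_queue_slot {
	uint32_t mec;
	uint32_t pipe;
};

/* Mirror the hunk: bump pipe_id first, then split it by pipes per MEC. */
static struct demo_queue_slot demo_decode_pipe(uint32_t pipe_id,
					       uint32_t num_pipe_per_mec)
{
	struct demo_queue_slot slot;

	assert(num_pipe_per_mec != 0);
	++pipe_id;
	slot.mec = pipe_id / num_pipe_per_mec + 1;
	slot.pipe = pipe_id % num_pipe_per_mec;
	return slot;
}
```

With four pipes per MEC, `pipe_id` 0 maps to MEC 1, pipe 1, and `pipe_id` 3 maps to MEC 2, pipe 0.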
......@@ -243,18 +244,7 @@ static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, unsigned int pasid,
static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
uint32_t hpd_size, uint64_t hpd_gpu_addr)
{
struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t mec = (++pipe_id / CIK_PIPE_PER_MEC) + 1;
uint32_t pipe = (pipe_id % CIK_PIPE_PER_MEC);
lock_srbm(kgd, mec, pipe, 0, 0);
WREG32(mmCP_HPD_EOP_BASE_ADDR, lower_32_bits(hpd_gpu_addr >> 8));
WREG32(mmCP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(hpd_gpu_addr >> 8));
WREG32(mmCP_HPD_EOP_VMID, 0);
WREG32(mmCP_HPD_EOP_CONTROL, hpd_size);
unlock_srbm(kgd);
/* amdgpu owns the per-pipe state */
return 0;
}
......@@ -264,8 +254,8 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id)
uint32_t mec;
uint32_t pipe;
mec = (pipe_id / CIK_PIPE_PER_MEC) + 1;
pipe = (pipe_id % CIK_PIPE_PER_MEC);
mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
lock_srbm(kgd, mec, pipe, 0, 0);
......@@ -309,55 +299,11 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
m = get_mqd(mqd);
is_wptr_shadow_valid = !get_user(wptr_shadow, wptr);
acquire_queue(kgd, pipe_id, queue_id);
WREG32(mmCP_MQD_BASE_ADDR, m->cp_mqd_base_addr_lo);
WREG32(mmCP_MQD_BASE_ADDR_HI, m->cp_mqd_base_addr_hi);
WREG32(mmCP_MQD_CONTROL, m->cp_mqd_control);
WREG32(mmCP_HQD_PQ_BASE, m->cp_hqd_pq_base_lo);
WREG32(mmCP_HQD_PQ_BASE_HI, m->cp_hqd_pq_base_hi);
WREG32(mmCP_HQD_PQ_CONTROL, m->cp_hqd_pq_control);
WREG32(mmCP_HQD_IB_CONTROL, m->cp_hqd_ib_control);
WREG32(mmCP_HQD_IB_BASE_ADDR, m->cp_hqd_ib_base_addr_lo);
WREG32(mmCP_HQD_IB_BASE_ADDR_HI, m->cp_hqd_ib_base_addr_hi);
WREG32(mmCP_HQD_IB_RPTR, m->cp_hqd_ib_rptr);
WREG32(mmCP_HQD_PERSISTENT_STATE, m->cp_hqd_persistent_state);
WREG32(mmCP_HQD_SEMA_CMD, m->cp_hqd_sema_cmd);
WREG32(mmCP_HQD_MSG_TYPE, m->cp_hqd_msg_type);
WREG32(mmCP_HQD_ATOMIC0_PREOP_LO, m->cp_hqd_atomic0_preop_lo);
WREG32(mmCP_HQD_ATOMIC0_PREOP_HI, m->cp_hqd_atomic0_preop_hi);
WREG32(mmCP_HQD_ATOMIC1_PREOP_LO, m->cp_hqd_atomic1_preop_lo);
WREG32(mmCP_HQD_ATOMIC1_PREOP_HI, m->cp_hqd_atomic1_preop_hi);
WREG32(mmCP_HQD_PQ_RPTR_REPORT_ADDR, m->cp_hqd_pq_rptr_report_addr_lo);
WREG32(mmCP_HQD_PQ_RPTR_REPORT_ADDR_HI,
m->cp_hqd_pq_rptr_report_addr_hi);
WREG32(mmCP_HQD_PQ_RPTR, m->cp_hqd_pq_rptr);
WREG32(mmCP_HQD_PQ_WPTR_POLL_ADDR, m->cp_hqd_pq_wptr_poll_addr_lo);
WREG32(mmCP_HQD_PQ_WPTR_POLL_ADDR_HI, m->cp_hqd_pq_wptr_poll_addr_hi);
WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, m->cp_hqd_pq_doorbell_control);
WREG32(mmCP_HQD_VMID, m->cp_hqd_vmid);
WREG32(mmCP_HQD_QUANTUM, m->cp_hqd_quantum);
WREG32(mmCP_HQD_PIPE_PRIORITY, m->cp_hqd_pipe_priority);
WREG32(mmCP_HQD_QUEUE_PRIORITY, m->cp_hqd_queue_priority);
WREG32(mmCP_HQD_IQ_RPTR, m->cp_hqd_iq_rptr);
if (is_wptr_shadow_valid)
WREG32(mmCP_HQD_PQ_WPTR, wptr_shadow);
m->cp_hqd_pq_wptr = wptr_shadow;
WREG32(mmCP_HQD_ACTIVE, m->cp_hqd_active);
acquire_queue(kgd, pipe_id, queue_id);
gfx_v7_0_mqd_commit(adev, m);
release_queue(kgd);
return 0;
......
......@@ -28,6 +28,7 @@
#include "amdgpu.h"
#include "amdgpu_amdkfd.h"
#include "amdgpu_ucode.h"
#include "gfx_v8_0.h"
#include "gca/gfx_8_0_sh_mask.h"
#include "gca/gfx_8_0_d.h"
#include "gca/gfx_8_0_enum.h"
......@@ -38,8 +39,6 @@
#include "vi_structs.h"
#include "vid.h"
#define VI_PIPE_PER_MEC (4)
struct cik_sdma_rlc_registers;
/*
......@@ -146,8 +145,10 @@ static void unlock_srbm(struct kgd_dev *kgd)
static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
uint32_t queue_id)
{
uint32_t mec = (++pipe_id / VI_PIPE_PER_MEC) + 1;
uint32_t pipe = (pipe_id % VI_PIPE_PER_MEC);
struct amdgpu_device *adev = get_amdgpu_device(kgd);
uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
lock_srbm(kgd, mec, pipe, queue_id, 0);
}
......@@ -205,6 +206,7 @@ static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, unsigned int pasid,
static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
uint32_t hpd_size, uint64_t hpd_gpu_addr)
{
/* amdgpu owns the per-pipe state */
return 0;
}
......@@ -214,8 +216,8 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id)
uint32_t mec;
uint32_t pipe;
mec = (++pipe_id / VI_PIPE_PER_MEC) + 1;
pipe = (pipe_id % VI_PIPE_PER_MEC);
mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
lock_srbm(kgd, mec, pipe, 0, 0);
......@@ -251,53 +253,11 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
m = get_mqd(mqd);
valid_wptr = copy_from_user(&shadow_wptr, wptr, sizeof(shadow_wptr));
acquire_queue(kgd, pipe_id, queue_id);
WREG32(mmCP_MQD_CONTROL, m->cp_mqd_control);
WREG32(mmCP_MQD_BASE_ADDR, m->cp_mqd_base_addr_lo);
WREG32(mmCP_MQD_BASE_ADDR_HI, m->cp_mqd_base_addr_hi);
WREG32(mmCP_HQD_VMID, m->cp_hqd_vmid);
WREG32(mmCP_HQD_PERSISTENT_STATE, m->cp_hqd_persistent_state);
WREG32(mmCP_HQD_PIPE_PRIORITY, m->cp_hqd_pipe_priority);
WREG32(mmCP_HQD_QUEUE_PRIORITY, m->cp_hqd_queue_priority);
WREG32(mmCP_HQD_QUANTUM, m->cp_hqd_quantum);
WREG32(mmCP_HQD_PQ_BASE, m->cp_hqd_pq_base_lo);
WREG32(mmCP_HQD_PQ_BASE_HI, m->cp_hqd_pq_base_hi);
WREG32(mmCP_HQD_PQ_RPTR_REPORT_ADDR, m->cp_hqd_pq_rptr_report_addr_lo);
WREG32(mmCP_HQD_PQ_RPTR_REPORT_ADDR_HI,
m->cp_hqd_pq_rptr_report_addr_hi);
if (valid_wptr > 0)
WREG32(mmCP_HQD_PQ_WPTR, shadow_wptr);
WREG32(mmCP_HQD_PQ_CONTROL, m->cp_hqd_pq_control);
WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, m->cp_hqd_pq_doorbell_control);
WREG32(mmCP_HQD_EOP_BASE_ADDR, m->cp_hqd_eop_base_addr_lo);
WREG32(mmCP_HQD_EOP_BASE_ADDR_HI, m->cp_hqd_eop_base_addr_hi);
WREG32(mmCP_HQD_EOP_CONTROL, m->cp_hqd_eop_control);
WREG32(mmCP_HQD_EOP_RPTR, m->cp_hqd_eop_rptr);
WREG32(mmCP_HQD_EOP_WPTR, m->cp_hqd_eop_wptr);
WREG32(mmCP_HQD_EOP_EVENTS, m->cp_hqd_eop_done_events);
WREG32(mmCP_HQD_CTX_SAVE_BASE_ADDR_LO, m->cp_hqd_ctx_save_base_addr_lo);
WREG32(mmCP_HQD_CTX_SAVE_BASE_ADDR_HI, m->cp_hqd_ctx_save_base_addr_hi);
WREG32(mmCP_HQD_CTX_SAVE_CONTROL, m->cp_hqd_ctx_save_control);
WREG32(mmCP_HQD_CNTL_STACK_OFFSET, m->cp_hqd_cntl_stack_offset);
WREG32(mmCP_HQD_CNTL_STACK_SIZE, m->cp_hqd_cntl_stack_size);
WREG32(mmCP_HQD_WG_STATE_OFFSET, m->cp_hqd_wg_state_offset);
WREG32(mmCP_HQD_CTX_SAVE_SIZE, m->cp_hqd_ctx_save_size);
WREG32(mmCP_HQD_IB_CONTROL, m->cp_hqd_ib_control);
WREG32(mmCP_HQD_DEQUEUE_REQUEST, m->cp_hqd_dequeue_request);
WREG32(mmCP_HQD_ERROR, m->cp_hqd_error);
WREG32(mmCP_HQD_EOP_WPTR_MEM, m->cp_hqd_eop_wptr_mem);
WREG32(mmCP_HQD_EOP_DONES, m->cp_hqd_eop_dones);
WREG32(mmCP_HQD_ACTIVE, m->cp_hqd_active);
if (valid_wptr == 0)
m->cp_hqd_pq_wptr = shadow_wptr;
acquire_queue(kgd, pipe_id, queue_id);
gfx_v8_0_mqd_commit(adev, mqd);
release_queue(kgd);
return 0;
......
......@@ -35,33 +35,59 @@
#define AMDGPU_BO_LIST_MAX_PRIORITY 32u
#define AMDGPU_BO_LIST_NUM_BUCKETS (AMDGPU_BO_LIST_MAX_PRIORITY + 1)
static int amdgpu_bo_list_create(struct amdgpu_fpriv *fpriv,
struct amdgpu_bo_list **result,
static int amdgpu_bo_list_set(struct amdgpu_device *adev,
struct drm_file *filp,
struct amdgpu_bo_list *list,
struct drm_amdgpu_bo_list_entry *info,
unsigned num_entries);
static void amdgpu_bo_list_release_rcu(struct kref *ref)
{
unsigned i;
struct amdgpu_bo_list *list = container_of(ref, struct amdgpu_bo_list,
refcount);
for (i = 0; i < list->num_entries; ++i)
amdgpu_bo_unref(&list->array[i].robj);
mutex_destroy(&list->lock);
kvfree(list->array);
kfree_rcu(list, rhead);
}
static int amdgpu_bo_list_create(struct amdgpu_device *adev,
struct drm_file *filp,
struct drm_amdgpu_bo_list_entry *info,
unsigned num_entries,
int *id)
{
int r;
struct amdgpu_fpriv *fpriv = filp->driver_priv;
struct amdgpu_bo_list *list;
*result = kzalloc(sizeof(struct amdgpu_bo_list), GFP_KERNEL);
if (!*result)
list = kzalloc(sizeof(struct amdgpu_bo_list), GFP_KERNEL);
if (!list)
return -ENOMEM;
/* initialize bo list */
mutex_init(&list->lock);
kref_init(&list->refcount);
r = amdgpu_bo_list_set(adev, filp, list, info, num_entries);
if (r) {
kfree(list);
return r;
}
/* idr alloc should be called only after initialization of bo list. */
mutex_lock(&fpriv->bo_list_lock);
r = idr_alloc(&fpriv->bo_list_handles, *result,
1, 0, GFP_KERNEL);
r = idr_alloc(&fpriv->bo_list_handles, list, 1, 0, GFP_KERNEL);
mutex_unlock(&fpriv->bo_list_lock);
if (r < 0) {
mutex_unlock(&fpriv->bo_list_lock);
kfree(*result);
kfree(list);
return r;
}
*id = r;
mutex_init(&(*result)->lock);
(*result)->num_entries = 0;
(*result)->array = NULL;
mutex_lock(&(*result)->lock);
mutex_unlock(&fpriv->bo_list_lock);
return 0;
}
......@@ -71,13 +97,9 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
mutex_lock(&fpriv->bo_list_lock);
list = idr_remove(&fpriv->bo_list_handles, id);
if (list) {
/* Another user may have a reference to this list still */
mutex_lock(&list->lock);
mutex_unlock(&list->lock);
amdgpu_bo_list_free(list);
}
mutex_unlock(&fpriv->bo_list_lock);
if (list)
kref_put(&list->refcount, amdgpu_bo_list_release_rcu);
}
static int amdgpu_bo_list_set(struct amdgpu_device *adev,
......@@ -96,7 +118,7 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev,
int r;
unsigned long total_size = 0;
array = drm_malloc_ab(num_entries, sizeof(struct amdgpu_bo_list_entry));
array = kvmalloc_array(num_entries, sizeof(struct amdgpu_bo_list_entry), GFP_KERNEL);
if (!array)
return -ENOMEM;
memset(array, 0, num_entries * sizeof(struct amdgpu_bo_list_entry));
......@@ -148,7 +170,7 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev,
for (i = 0; i < list->num_entries; ++i)
amdgpu_bo_unref(&list->array[i].robj);
drm_free_large(list->array);
kvfree(list->array);
list->gds_obj = gds_obj;
list->gws_obj = gws_obj;
......@@ -163,7 +185,7 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev,
error_free:
while (i--)
amdgpu_bo_unref(&array[i].robj);
drm_free_large(array);
kvfree(array);
return r;
}
......@@ -172,11 +194,17 @@ amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
{
struct amdgpu_bo_list *result;
mutex_lock(&fpriv->bo_list_lock);
rcu_read_lock();
result = idr_find(&fpriv->bo_list_handles, id);
if (result)
mutex_lock(&result->lock);
mutex_unlock(&fpriv->bo_list_lock);
if (result) {
if (kref_get_unless_zero(&result->refcount))
mutex_lock(&result->lock);
else
result = NULL;
}
rcu_read_unlock();
return result;
}
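The lookup above only hands out a list if its reference count can still be raised from a non-zero value, so a list that is already being torn down is never returned. A minimal userspace sketch of that "get unless zero" idea follows; it uses C11 atomics instead of the kernel's `struct kref`, and the names `struct ref`, `ref_init`, `ref_get_unless_zero` and `ref_put` are hypothetical stand-ins, not kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct kref. */
struct ref {
	atomic_int count;
};

/* A freshly created object starts with one reference held by its creator. */
static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference only if the object is still live (count != 0). */
static bool ref_get_unless_zero(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool ref_put(struct ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}
```

In the driver the final put additionally defers the free through RCU (`kfree_rcu`), which is what makes the unlocked `idr_find` under `rcu_read_lock()` safe; this sketch does not model that part.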
......@@ -214,6 +242,7 @@ void amdgpu_bo_list_get_list(struct amdgpu_bo_list *list,
void amdgpu_bo_list_put(struct amdgpu_bo_list *list)
{
mutex_unlock(&list->lock);
kref_put(&list->refcount, amdgpu_bo_list_release_rcu);
}
void amdgpu_bo_list_free(struct amdgpu_bo_list *list)
......@@ -224,7 +253,7 @@ void amdgpu_bo_list_free(struct amdgpu_bo_list *list)
amdgpu_bo_unref(&list->array[i].robj);
mutex_destroy(&list->lock);
drm_free_large(list->array);
kvfree(list->array);
kfree(list);
}
......@@ -244,8 +273,8 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
int r;
info = drm_malloc_ab(args->in.bo_number,
sizeof(struct drm_amdgpu_bo_list_entry));
info = kvmalloc_array(args->in.bo_number,
sizeof(struct drm_amdgpu_bo_list_entry), GFP_KERNEL);
if (!info)
return -ENOMEM;
......@@ -273,16 +302,10 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
switch (args->in.operation) {
case AMDGPU_BO_LIST_OP_CREATE:
r = amdgpu_bo_list_create(fpriv, &list, &handle);
r = amdgpu_bo_list_create(adev, filp, info, args->in.bo_number,
&handle);
if (r)
goto error_free;
r = amdgpu_bo_list_set(adev, filp, list, info,
args->in.bo_number);
amdgpu_bo_list_put(list);
if (r)
goto error_free;
break;
case AMDGPU_BO_LIST_OP_DESTROY:
......@@ -311,11 +334,11 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
memset(args, 0, sizeof(*args));
args->out.list_handle = handle;
drm_free_large(info);
kvfree(info);
return 0;
error_free:
drm_free_large(info);
kvfree(info);
return r;
}
......@@ -27,81 +27,10 @@
#include <linux/pagemap.h>
#include <drm/drmP.h>
#include <drm/amdgpu_drm.h>
#include <drm/drm_syncobj.h>
#include "amdgpu.h"
#include "amdgpu_trace.h"
int amdgpu_cs_get_ring(struct amdgpu_device *adev, u32 ip_type,
u32 ip_instance, u32 ring,
struct amdgpu_ring **out_ring)
{
/* Right now all IPs have only one instance - multiple rings. */
if (ip_instance != 0) {
DRM_ERROR("invalid ip instance: %d\n", ip_instance);
return -EINVAL;
}
switch (ip_type) {
default:
DRM_ERROR("unknown ip type: %d\n", ip_type);
return -EINVAL;
case AMDGPU_HW_IP_GFX:
if (ring < adev->gfx.num_gfx_rings) {
*out_ring = &adev->gfx.gfx_ring[ring];
} else {
DRM_ERROR("only %d gfx rings are supported now\n",
adev->gfx.num_gfx_rings);
return -EINVAL;
}
break;
case AMDGPU_HW_IP_COMPUTE:
if (ring < adev->gfx.num_compute_rings) {
*out_ring = &adev->gfx.compute_ring[ring];
} else {
DRM_ERROR("only %d compute rings are supported now\n",
adev->gfx.num_compute_rings);
return -EINVAL;
}
break;
case AMDGPU_HW_IP_DMA:
if (ring < adev->sdma.num_instances) {
*out_ring = &adev->sdma.instance[ring].ring;
} else {
DRM_ERROR("only %d SDMA rings are supported\n",
adev->sdma.num_instances);
return -EINVAL;
}
break;
case AMDGPU_HW_IP_UVD:
*out_ring = &adev->uvd.ring;
break;
case AMDGPU_HW_IP_VCE:
if (ring < adev->vce.num_rings){
*out_ring = &adev->vce.ring[ring];
} else {
DRM_ERROR("only %d VCE rings are supported\n", adev->vce.num_rings);
return -EINVAL;
}
break;
case AMDGPU_HW_IP_UVD_ENC:
if (ring < adev->uvd.num_enc_rings){
*out_ring = &adev->uvd.ring_enc[ring];
} else {
DRM_ERROR("only %d UVD ENC rings are supported\n",
adev->uvd.num_enc_rings);
return -EINVAL;
}
break;
}
if (!(*out_ring && (*out_ring)->adev)) {
DRM_ERROR("Ring %d is not initialized on IP %d\n",
ring, ip_type);
return -EINVAL;
}
return 0;
}
static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
struct drm_amdgpu_cs_chunk_fence *data,
uint32_t *offset)
......@@ -194,7 +123,7 @@ int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data)
size = p->chunks[i].length_dw;
cdata = (void __user *)(uintptr_t)user_chunk.chunk_data;
p->chunks[i].kdata = drm_malloc_ab(size, sizeof(uint32_t));
p->chunks[i].kdata = kvmalloc_array(size, sizeof(uint32_t), GFP_KERNEL);
if (p->chunks[i].kdata == NULL) {
ret = -ENOMEM;
i--;
......@@ -226,6 +155,8 @@ int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data)
break;
case AMDGPU_CHUNK_ID_DEPENDENCIES:
case AMDGPU_CHUNK_ID_SYNCOBJ_IN:
case AMDGPU_CHUNK_ID_SYNCOBJ_OUT:
break;
default:
......@@ -247,7 +178,7 @@ int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data)
i = p->nchunks - 1;
free_partial_kdata:
for (; i >= 0; i--)
drm_free_large(p->chunks[i].kdata);
kvfree(p->chunks[i].kdata);
kfree(p->chunks);
p->chunks = NULL;
p->nchunks = 0;
......@@ -505,7 +436,7 @@ static int amdgpu_cs_list_validate(struct amdgpu_cs_parser *p,
return r;
if (binding_userptr) {
drm_free_large(lobj->user_pages);
kvfree(lobj->user_pages);
lobj->user_pages = NULL;
}
}
......@@ -571,7 +502,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
release_pages(e->user_pages,
e->robj->tbo.ttm->num_pages,
false);
drm_free_large(e->user_pages);
kvfree(e->user_pages);
e->user_pages = NULL;
}
......@@ -597,12 +528,13 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
goto error_free_pages;
}
/* Fill the page arrays for all useptrs. */
/* Fill the page arrays for all userptrs. */
list_for_each_entry(e, &need_pages, tv.head) {
struct ttm_tt *ttm = e->robj->tbo.ttm;
e->user_pages = drm_calloc_large(ttm->num_pages,
sizeof(struct page*));
e->user_pages = kvmalloc_array(ttm->num_pages,
sizeof(struct page*),
GFP_KERNEL | __GFP_ZERO);
if (!e->user_pages) {
r = -ENOMEM;
DRM_ERROR("calloc failure in %s\n", __func__);
......@@ -612,7 +544,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
r = amdgpu_ttm_tt_get_user_pages(ttm, e->user_pages);
if (r) {
DRM_ERROR("amdgpu_ttm_tt_get_user_pages failed.\n");
drm_free_large(e->user_pages);
kvfree(e->user_pages);
e->user_pages = NULL;
goto error_free_pages;
}
......@@ -708,7 +640,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
release_pages(e->user_pages,
e->robj->tbo.ttm->num_pages,
false);
drm_free_large(e->user_pages);
kvfree(e->user_pages);
}
}
......@@ -753,6 +685,11 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error, bo
ttm_eu_backoff_reservation(&parser->ticket,
&parser->validated);
}
for (i = 0; i < parser->num_post_dep_syncobjs; i++)
drm_syncobj_put(parser->post_dep_syncobjs[i]);
kfree(parser->post_dep_syncobjs);
dma_fence_put(parser->fence);
if (parser->ctx)
......@@ -761,7 +698,7 @@ static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error, bo
amdgpu_bo_list_put(parser->bo_list);
for (i = 0; i < parser->nchunks; i++)
drm_free_large(parser->chunks[i].kdata);
kvfree(parser->chunks[i].kdata);
kfree(parser->chunks);
if (parser->job)
amdgpu_job_free(parser->job);
......@@ -916,9 +853,8 @@ static int amdgpu_cs_ib_fill(struct amdgpu_device *adev,
return -EINVAL;
}
r = amdgpu_cs_get_ring(adev, chunk_ib->ip_type,
chunk_ib->ip_instance, chunk_ib->ring,
&ring);
r = amdgpu_queue_mgr_map(adev, &parser->ctx->queue_mgr, chunk_ib->ip_type,
chunk_ib->ip_instance, chunk_ib->ring, &ring);
if (r)
return r;
......@@ -995,62 +931,150 @@ static int amdgpu_cs_ib_fill(struct amdgpu_device *adev,
return 0;
}
static int amdgpu_cs_dependencies(struct amdgpu_device *adev,
struct amdgpu_cs_parser *p)
static int amdgpu_cs_process_fence_dep(struct amdgpu_cs_parser *p,
struct amdgpu_cs_chunk *chunk)
{
struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
int i, j, r;
for (i = 0; i < p->nchunks; ++i) {
struct drm_amdgpu_cs_chunk_dep *deps;
struct amdgpu_cs_chunk *chunk;
unsigned num_deps;
unsigned num_deps;
int i, r;
struct drm_amdgpu_cs_chunk_dep *deps;
chunk = &p->chunks[i];
deps = (struct drm_amdgpu_cs_chunk_dep *)chunk->kdata;
num_deps = chunk->length_dw * 4 /
sizeof(struct drm_amdgpu_cs_chunk_dep);
if (chunk->chunk_id != AMDGPU_CHUNK_ID_DEPENDENCIES)
continue;
for (i = 0; i < num_deps; ++i) {
struct amdgpu_ring *ring;
struct amdgpu_ctx *ctx;
struct dma_fence *fence;
deps = (struct drm_amdgpu_cs_chunk_dep *)chunk->kdata;
num_deps = chunk->length_dw * 4 /
sizeof(struct drm_amdgpu_cs_chunk_dep);
ctx = amdgpu_ctx_get(fpriv, deps[i].ctx_id);
if (ctx == NULL)
return -EINVAL;
for (j = 0; j < num_deps; ++j) {
struct amdgpu_ring *ring;
struct amdgpu_ctx *ctx;
struct dma_fence *fence;
r = amdgpu_queue_mgr_map(p->adev, &ctx->queue_mgr,
deps[i].ip_type,
deps[i].ip_instance,
deps[i].ring, &ring);
if (r) {
amdgpu_ctx_put(ctx);
return r;
}
r = amdgpu_cs_get_ring(adev, deps[j].ip_type,
deps[j].ip_instance,
deps[j].ring, &ring);
fence = amdgpu_ctx_get_fence(ctx, ring,
deps[i].handle);
if (IS_ERR(fence)) {
r = PTR_ERR(fence);
amdgpu_ctx_put(ctx);
return r;
} else if (fence) {
r = amdgpu_sync_fence(p->adev, &p->job->sync,
fence);
dma_fence_put(fence);
amdgpu_ctx_put(ctx);
if (r)
return r;
}
}
return 0;
}
ctx = amdgpu_ctx_get(fpriv, deps[j].ctx_id);
if (ctx == NULL)
return -EINVAL;
static int amdgpu_syncobj_lookup_and_add_to_sync(struct amdgpu_cs_parser *p,
uint32_t handle)
{
int r;
struct dma_fence *fence;
r = drm_syncobj_fence_get(p->filp, handle, &fence);
if (r)
return r;
fence = amdgpu_ctx_get_fence(ctx, ring,
deps[j].handle);
if (IS_ERR(fence)) {
r = PTR_ERR(fence);
amdgpu_ctx_put(ctx);
return r;
r = amdgpu_sync_fence(p->adev, &p->job->sync, fence);
dma_fence_put(fence);
} else if (fence) {
r = amdgpu_sync_fence(adev, &p->job->sync,
fence);
dma_fence_put(fence);
amdgpu_ctx_put(ctx);
if (r)
return r;
}
return r;
}
static int amdgpu_cs_process_syncobj_in_dep(struct amdgpu_cs_parser *p,
struct amdgpu_cs_chunk *chunk)
{
unsigned num_deps;
int i, r;
struct drm_amdgpu_cs_chunk_sem *deps;
deps = (struct drm_amdgpu_cs_chunk_sem *)chunk->kdata;
num_deps = chunk->length_dw * 4 /
sizeof(struct drm_amdgpu_cs_chunk_sem);
for (i = 0; i < num_deps; ++i) {
r = amdgpu_syncobj_lookup_and_add_to_sync(p, deps[i].handle);
if (r)
return r;
}
return 0;
}
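The chunk handlers above derive the number of entries from `length_dw * 4 / sizeof(entry)`, that is, the chunk length in 32-bit dwords converted to bytes and divided by the entry size. A small standalone sketch of that arithmetic follows; `struct dep_entry` and `struct sem_entry` are hypothetical mirrors of the UAPI structs, written out here only so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a fence dependency entry (four u32 fields and one u64). */
struct dep_entry {
	uint32_t ip_type;
	uint32_t ip_instance;
	uint32_t ring;
	uint32_t ctx_id;
	uint64_t handle;
};

/* Hypothetical mirror of a syncobj entry (a single u32 handle). */
struct sem_entry {
	uint32_t handle;
};

/* Number of whole entries that fit in a chunk of length_dw dwords. */
static unsigned chunk_entry_count(uint32_t length_dw, size_t entry_size)
{
	return (unsigned)((size_t)length_dw * 4 / entry_size);
}
```

Because the division truncates, a chunk whose byte length is not a multiple of the entry size silently drops the trailing partial entry.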
static int amdgpu_cs_process_syncobj_out_dep(struct amdgpu_cs_parser *p,
struct amdgpu_cs_chunk *chunk)
{
unsigned num_deps;
int i;
struct drm_amdgpu_cs_chunk_sem *deps;
deps = (struct drm_amdgpu_cs_chunk_sem *)chunk->kdata;
num_deps = chunk->length_dw * 4 /
sizeof(struct drm_amdgpu_cs_chunk_sem);
p->post_dep_syncobjs = kmalloc_array(num_deps,
sizeof(struct drm_syncobj *),
GFP_KERNEL);
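p->num_post_dep_syncobjs = 0;
/* bail out before the array is dereferenced if the allocation failed */
if (!p->post_dep_syncobjs)
return -ENOMEM;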
for (i = 0; i < num_deps; ++i) {
p->post_dep_syncobjs[i] = drm_syncobj_find(p->filp, deps[i].handle);
if (!p->post_dep_syncobjs[i])
return -EINVAL;
p->num_post_dep_syncobjs++;
}
return 0;
}
static int amdgpu_cs_dependencies(struct amdgpu_device *adev,
struct amdgpu_cs_parser *p)
{
int i, r;
for (i = 0; i < p->nchunks; ++i) {
struct amdgpu_cs_chunk *chunk;
chunk = &p->chunks[i];
if (chunk->chunk_id == AMDGPU_CHUNK_ID_DEPENDENCIES) {
r = amdgpu_cs_process_fence_dep(p, chunk);
if (r)
return r;
} else if (chunk->chunk_id == AMDGPU_CHUNK_ID_SYNCOBJ_IN) {
r = amdgpu_cs_process_syncobj_in_dep(p, chunk);
if (r)
return r;
} else if (chunk->chunk_id == AMDGPU_CHUNK_ID_SYNCOBJ_OUT) {
r = amdgpu_cs_process_syncobj_out_dep(p, chunk);
if (r)
return r;
}
}
return 0;
}
static void amdgpu_cs_post_dependencies(struct amdgpu_cs_parser *p)
{
int i;
for (i = 0; i < p->num_post_dep_syncobjs; ++i) {
drm_syncobj_replace_fence(p->filp, p->post_dep_syncobjs[i],
p->fence);
}
}
static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
union drm_amdgpu_cs *cs)
{
......@@ -1071,6 +1095,9 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
job->owner = p->filp;
job->fence_ctx = entity->fence_context;
p->fence = dma_fence_get(&job->base.s_fence->finished);
amdgpu_cs_post_dependencies(p);
cs->out.handle = amdgpu_ctx_add_fence(p->ctx, ring, p->fence);
job->uf_sequence = cs->out.handle;
amdgpu_job_free_resources(job);
......@@ -1078,13 +1105,13 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
trace_amdgpu_cs_ioctl(job);
amd_sched_entity_push_job(&job->base);
return 0;
}
int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
{
struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_fpriv *fpriv = filp->driver_priv;
union drm_amdgpu_cs *cs = data;
struct amdgpu_cs_parser parser = {};
bool reserved_buffers = false;
......@@ -1092,6 +1119,8 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
if (!adev->accel_working)
return -EBUSY;
if (amdgpu_kms_vram_lost(adev, fpriv))
return -ENODEV;
parser.adev = adev;
parser.filp = filp;
......@@ -1153,21 +1182,28 @@ int amdgpu_cs_wait_ioctl(struct drm_device *dev, void *data,
{
union drm_amdgpu_wait_cs *wait = data;
struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_fpriv *fpriv = filp->driver_priv;
unsigned long timeout = amdgpu_gem_timeout(wait->in.timeout);
struct amdgpu_ring *ring = NULL;
struct amdgpu_ctx *ctx;
struct dma_fence *fence;
long r;
r = amdgpu_cs_get_ring(adev, wait->in.ip_type, wait->in.ip_instance,
wait->in.ring, &ring);
if (r)
return r;
if (amdgpu_kms_vram_lost(adev, fpriv))
return -ENODEV;
ctx = amdgpu_ctx_get(filp->driver_priv, wait->in.ctx_id);
if (ctx == NULL)
return -EINVAL;
r = amdgpu_queue_mgr_map(adev, &ctx->queue_mgr,
wait->in.ip_type, wait->in.ip_instance,
wait->in.ring, &ring);
if (r) {
amdgpu_ctx_put(ctx);
return r;
}
fence = amdgpu_ctx_get_fence(ctx, ring, wait->in.handle);
if (IS_ERR(fence))
r = PTR_ERR(fence);
......@@ -1203,15 +1239,17 @@ static struct dma_fence *amdgpu_cs_get_fence(struct amdgpu_device *adev,
struct dma_fence *fence;
int r;
r = amdgpu_cs_get_ring(adev, user->ip_type, user->ip_instance,
user->ring, &ring);
if (r)
return ERR_PTR(r);
ctx = amdgpu_ctx_get(filp->driver_priv, user->ctx_id);
if (ctx == NULL)
return ERR_PTR(-EINVAL);
r = amdgpu_queue_mgr_map(adev, &ctx->queue_mgr, user->ip_type,
user->ip_instance, user->ring, &ring);
if (r) {
amdgpu_ctx_put(ctx);
return ERR_PTR(r);
}
fence = amdgpu_ctx_get_fence(ctx, ring, user->seq_no);
amdgpu_ctx_put(ctx);
......@@ -1332,12 +1370,15 @@ int amdgpu_cs_wait_fences_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp)
{
struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_fpriv *fpriv = filp->driver_priv;
union drm_amdgpu_wait_fences *wait = data;
uint32_t fence_count = wait->in.fence_count;
struct drm_amdgpu_fence *fences_user;
struct drm_amdgpu_fence *fences;
int r;
if (amdgpu_kms_vram_lost(adev, fpriv))
return -ENODEV;
/* Get the fences from userspace */
fences = kmalloc_array(fence_count, sizeof(struct drm_amdgpu_fence),
GFP_KERNEL);
......
......@@ -52,12 +52,20 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev, struct amdgpu_ctx *ctx)
struct amd_sched_rq *rq;
rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_NORMAL];
if (ring == &adev->gfx.kiq.ring)
continue;
r = amd_sched_entity_init(&ring->sched, &ctx->rings[i].entity,
rq, amdgpu_sched_jobs);
if (r)
goto failed;
}
r = amdgpu_queue_mgr_init(adev, &ctx->queue_mgr);
if (r)
goto failed;
return 0;
failed:
......@@ -86,6 +94,8 @@ static void amdgpu_ctx_fini(struct amdgpu_ctx *ctx)
for (i = 0; i < adev->num_rings; i++)
amd_sched_entity_fini(&adev->rings[i]->sched,
&ctx->rings[i].entity);
amdgpu_queue_mgr_fini(adev, &ctx->queue_mgr);
}
static int amdgpu_ctx_alloc(struct amdgpu_device *adev,
......
......@@ -22,7 +22,7 @@
* Authors: Alex Deucher
*/
#include "drmP.h"
#include <drm/drmP.h>
#include "amdgpu.h"
#include "amdgpu_atombios.h"
#include "amdgpu_i2c.h"
......
......@@ -39,7 +39,7 @@
#include <linux/module.h>
#include <linux/pm_runtime.h>
#include <linux/vga_switcheroo.h>
#include "drm_crtc_helper.h"
#include <drm/drm_crtc_helper.h>
#include "amdgpu.h"
#include "amdgpu_irq.h"
......@@ -65,9 +65,11 @@
* - 3.13.0 - Add PRT support
* - 3.14.0 - Fix race in amdgpu_ctx_get_fence() and note new functionality
* - 3.15.0 - Export more gpu info for gfx9
* - 3.16.0 - Add reserved vmid support
* - 3.17.0 - Add AMDGPU_NUM_VRAM_CPU_PAGE_FAULTS.
*/
#define KMS_DRIVER_MAJOR 3
#define KMS_DRIVER_MINOR 15
#define KMS_DRIVER_MINOR 17
#define KMS_DRIVER_PATCHLEVEL 0
int amdgpu_vram_limit = 0;
......@@ -92,7 +94,8 @@ int amdgpu_vm_size = -1;
int amdgpu_vm_block_size = -1;
int amdgpu_vm_fault_stop = 0;
int amdgpu_vm_debug = 0;
int amdgpu_vram_page_split = 1024;
int amdgpu_vram_page_split = 512;
int amdgpu_vm_update_mode = -1;
int amdgpu_exp_hw_support = 0;
int amdgpu_sched_jobs = 32;
int amdgpu_sched_hw_submission = 2;
......@@ -110,6 +113,8 @@ int amdgpu_prim_buf_per_se = 0;
int amdgpu_pos_buf_per_se = 0;
int amdgpu_cntl_sb_buf_per_se = 0;
int amdgpu_param_buf_per_se = 0;
int amdgpu_job_hang_limit = 0;
int amdgpu_lbpw = -1;
MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
......@@ -177,6 +182,9 @@ module_param_named(vm_fault_stop, amdgpu_vm_fault_stop, int, 0444);
MODULE_PARM_DESC(vm_debug, "Debug VM handling (0 = disabled (default), 1 = enabled)");
module_param_named(vm_debug, amdgpu_vm_debug, int, 0644);
MODULE_PARM_DESC(vm_update_mode, "VM update using CPU (0 = never (default except for large BAR(LB)), 1 = Graphics only, 2 = Compute only (default for LB), 3 = Both");
module_param_named(vm_update_mode, amdgpu_vm_update_mode, int, 0444);
MODULE_PARM_DESC(vram_page_split, "Number of pages after we split VRAM allocations (default 512, -1 = disable)");
module_param_named(vram_page_split, amdgpu_vram_page_split, int, 0444);
......@@ -232,6 +240,24 @@ module_param_named(cntl_sb_buf_per_se, amdgpu_cntl_sb_buf_per_se, int, 0444);
MODULE_PARM_DESC(param_buf_per_se, "the size of Off-Chip Parameter Cache per Shader Engine (default depending on gfx)");
module_param_named(param_buf_per_se, amdgpu_param_buf_per_se, int, 0444);
MODULE_PARM_DESC(job_hang_limit, "how much time allow a job hang and not drop it (default 0)");
module_param_named(job_hang_limit, amdgpu_job_hang_limit, int, 0444);
MODULE_PARM_DESC(lbpw, "Load Balancing Per Watt (LBPW) support (1 = enable, 0 = disable, -1 = auto)");
module_param_named(lbpw, amdgpu_lbpw, int, 0444);
#ifdef CONFIG_DRM_AMDGPU_SI
int amdgpu_si_support = 0;
MODULE_PARM_DESC(si_support, "SI support (1 = enabled, 0 = disabled (default))");
module_param_named(si_support, amdgpu_si_support, int, 0444);
#endif
#ifdef CONFIG_DRM_AMDGPU_CIK
int amdgpu_cik_support = 0;
MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled (default))");
module_param_named(cik_support, amdgpu_cik_support, int, 0444);
#endif
static const struct pci_device_id pciidlist[] = {
#ifdef CONFIG_DRM_AMDGPU_SI
......@@ -460,6 +486,9 @@ static const struct pci_device_id pciidlist[] = {
{0x1002, 0x6868, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
{0x1002, 0x686c, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
{0x1002, 0x687f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
/* Raven */
{0x1002, 0x15dd, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RAVEN|AMD_IS_APU|AMD_EXP_HW_SUPPORT},
{0, 0, 0}
};
......@@ -491,6 +520,7 @@ static int amdgpu_kick_out_firmware_fb(struct pci_dev *pdev)
static int amdgpu_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
struct drm_device *dev;
unsigned long flags = ent->driver_data;
int ret;
......@@ -513,7 +543,29 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
if (ret)
return ret;
return drm_get_pci_dev(pdev, ent, &kms_driver);
dev = drm_dev_alloc(&kms_driver, &pdev->dev);
if (IS_ERR(dev))
return PTR_ERR(dev);
ret = pci_enable_device(pdev);
if (ret)
goto err_free;
dev->pdev = pdev;
pci_set_drvdata(pdev, dev);
ret = drm_dev_register(dev, ent->driver_data);
if (ret)
goto err_pci;
return 0;
err_pci:
pci_disable_device(pdev);
err_free:
drm_dev_unref(dev);
return ret;
}
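The probe path above acquires resources in order (allocate, enable, register) and unwinds them in reverse through stacked `goto` labels when a later step fails. A minimal userspace sketch of that unwinding pattern follows; `struct probe_state`, `probe_sketch` and the `fail_at` switch are hypothetical and exist only to make the control flow observable.

```c
#include <assert.h>

/* Hypothetical record of which resources are currently held. */
struct probe_state {
	int allocated;
	int enabled;
	int registered;
};

/*
 * fail_at selects which step fails: 0 = allocation, 1 = enable,
 * 2 = register, anything else = success.
 */
static int probe_sketch(struct probe_state *s, int fail_at)
{
	int ret;

	if (fail_at == 0)
		return -12;	/* nothing acquired yet, nothing to undo */
	s->allocated = 1;

	ret = (fail_at == 1) ? -5 : 0;
	if (ret)
		goto err_free;
	s->enabled = 1;

	ret = (fail_at == 2) ? -22 : 0;
	if (ret)
		goto err_disable;
	s->registered = 1;

	return 0;

err_disable:
	s->enabled = 0;		/* undo the enable step */
err_free:
	s->allocated = 0;	/* undo the allocation */
	return ret;
}
```

Each label undoes exactly one step and falls through to the labels for earlier steps, so a failure at any point releases only what was actually acquired.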
static void
......@@ -521,7 +573,8 @@ amdgpu_pci_remove(struct pci_dev *pdev)
{
struct drm_device *dev = pci_get_drvdata(pdev);
drm_put_dev(dev);
drm_dev_unregister(dev);
drm_dev_unref(dev);
}
static void
......@@ -715,11 +768,21 @@ static const struct file_operations amdgpu_driver_kms_fops = {
#endif
};
static bool
amdgpu_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe,
bool in_vblank_irq, int *vpos, int *hpos,
ktime_t *stime, ktime_t *etime,
const struct drm_display_mode *mode)
{
return amdgpu_get_crtc_scanoutpos(dev, pipe, 0, vpos, hpos,
stime, etime, mode);
}
static struct drm_driver kms_driver = {
.driver_features =
DRIVER_USE_AGP |
DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM |
DRIVER_PRIME | DRIVER_RENDER | DRIVER_MODESET,
DRIVER_PRIME | DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ,
.load = amdgpu_driver_load_kms,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
......@@ -729,8 +792,8 @@ static struct drm_driver kms_driver = {
.get_vblank_counter = amdgpu_get_vblank_counter_kms,
.enable_vblank = amdgpu_enable_vblank_kms,
.disable_vblank = amdgpu_disable_vblank_kms,
.get_vblank_timestamp = amdgpu_get_vblank_timestamp_kms,
.get_scanout_position = amdgpu_get_crtc_scanoutpos,
.get_vblank_timestamp = drm_calc_vbltimestamp_from_scanoutpos,
.get_scanout_position = amdgpu_get_crtc_scanout_position,
#if defined(CONFIG_DEBUG_FS)
.debugfs_init = amdgpu_debugfs_init,
#endif
......@@ -807,7 +870,7 @@ static int __init amdgpu_init(void)
driver->num_ioctls = amdgpu_max_kms_ioctl;
amdgpu_register_atpx_handler();
/* let modprobe override vga console setting */
return drm_pci_init(driver, pdriver);
return pci_register_driver(pdriver);
error_sched:
amdgpu_fence_slab_fini();
......@@ -822,7 +885,7 @@ static int __init amdgpu_init(void)
static void __exit amdgpu_exit(void)
{
amdgpu_amdkfd_fini();
drm_pci_exit(driver, pdriver);
pci_unregister_driver(pdriver);
amdgpu_unregister_atpx_handler();
amdgpu_sync_fini();
amd_sched_fence_slab_fini();
......
......@@ -541,6 +541,12 @@ void amdgpu_fence_driver_force_completion(struct amdgpu_device *adev)
}
}
void amdgpu_fence_driver_force_completion_ring(struct amdgpu_ring *ring)
{
if (ring)
amdgpu_fence_write(ring, ring->fence_drv.sync_seq);
}
/*
* Common fence implementation
*/
......@@ -660,11 +666,17 @@ static const struct drm_info_list amdgpu_debugfs_fence_list[] = {
{"amdgpu_fence_info", &amdgpu_debugfs_fence_info, 0, NULL},
{"amdgpu_gpu_reset", &amdgpu_debugfs_gpu_reset, 0, NULL}
};
static const struct drm_info_list amdgpu_debugfs_fence_list_sriov[] = {
{"amdgpu_fence_info", &amdgpu_debugfs_fence_info, 0, NULL},
};
#endif
int amdgpu_debugfs_fence_init(struct amdgpu_device *adev)
{
#if defined(CONFIG_DEBUG_FS)
if (amdgpu_sriov_vf(adev))
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_fence_list_sriov, 1);
return amdgpu_debugfs_add_files(adev, amdgpu_debugfs_fence_list, 2);
#else
return 0;
......
......@@ -224,8 +224,9 @@ void amdgpu_gart_table_vram_free(struct amdgpu_device *adev)
*
* Unbinds the requested pages from the gart page table and
* replaces them with the dummy page (all asics).
* Returns 0 for success, -EINVAL for failure.
*/
void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
int pages)
{
unsigned t;
......@@ -237,7 +238,7 @@ void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
if (!adev->gart.ready) {
WARN(1, "trying to unbind memory from uninitialized GART !\n");
return;
return -EINVAL;
}
t = offset / AMDGPU_GPU_PAGE_SIZE;
......@@ -258,6 +259,7 @@ void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
}
mb();
amdgpu_gart_flush_gpu_tlb(adev, 0);
return 0;
}
/**
......
......@@ -219,16 +219,6 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
ttm_eu_backoff_reservation(&ticket, &list);
}
static int amdgpu_gem_handle_lockup(struct amdgpu_device *adev, int r)
{
if (r == -EDEADLK) {
r = amdgpu_gpu_reset(adev);
if (!r)
r = -EAGAIN;
}
return r;
}
/*
* GEM ioctls.
*/
......@@ -249,20 +239,17 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
AMDGPU_GEM_CREATE_CPU_GTT_USWC |
AMDGPU_GEM_CREATE_VRAM_CLEARED|
AMDGPU_GEM_CREATE_SHADOW |
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS)) {
r = -EINVAL;
goto error_unlock;
}
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS))
return -EINVAL;
/* reject invalid gem domains */
if (args->in.domains & ~(AMDGPU_GEM_DOMAIN_CPU |
AMDGPU_GEM_DOMAIN_GTT |
AMDGPU_GEM_DOMAIN_VRAM |
AMDGPU_GEM_DOMAIN_GDS |
AMDGPU_GEM_DOMAIN_GWS |
AMDGPU_GEM_DOMAIN_OA)) {
r = -EINVAL;
goto error_unlock;
}
AMDGPU_GEM_DOMAIN_OA))
return -EINVAL;
/* create a gem object to contain this object in */
if (args->in.domains & (AMDGPU_GEM_DOMAIN_GDS |
......@@ -274,10 +261,8 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
size = size << AMDGPU_GWS_SHIFT;
else if (args->in.domains == AMDGPU_GEM_DOMAIN_OA)
size = size << AMDGPU_OA_SHIFT;
else {
r = -EINVAL;
goto error_unlock;
}
else
return -EINVAL;
}
size = roundup(size, PAGE_SIZE);
......@@ -286,21 +271,17 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
args->in.domain_flags,
kernel, &gobj);
if (r)
goto error_unlock;
return r;
r = drm_gem_handle_create(filp, gobj, &handle);
/* drop reference from allocate - handle holds it now */
drm_gem_object_unreference_unlocked(gobj);
if (r)
goto error_unlock;
return r;
memset(args, 0, sizeof(*args));
args->out.handle = handle;
return 0;
error_unlock:
r = amdgpu_gem_handle_lockup(adev, r);
return r;
}
int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
......@@ -334,7 +315,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
AMDGPU_GEM_DOMAIN_CPU, 0,
0, &gobj);
if (r)
goto handle_lockup;
return r;
bo = gem_to_amdgpu_bo(gobj);
bo->prefered_domains = AMDGPU_GEM_DOMAIN_GTT;
......@@ -374,7 +355,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
/* drop reference from allocate - handle holds it now */
drm_gem_object_unreference_unlocked(gobj);
if (r)
goto handle_lockup;
return r;
args->handle = handle;
return 0;
......@@ -388,9 +369,6 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
release_object:
drm_gem_object_unreference_unlocked(gobj);
handle_lockup:
r = amdgpu_gem_handle_lockup(adev, r);
return r;
}
......@@ -456,7 +434,6 @@ unsigned long amdgpu_gem_timeout(uint64_t timeout_ns)
int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp)
{
struct amdgpu_device *adev = dev->dev_private;
union drm_amdgpu_gem_wait_idle *args = data;
struct drm_gem_object *gobj;
struct amdgpu_bo *robj;
......@@ -484,7 +461,6 @@ int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
r = ret;
drm_gem_object_unreference_unlocked(gobj);
r = amdgpu_gem_handle_lockup(adev, r);
return r;
}
......@@ -593,9 +569,6 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
uint64_t va_flags;
int r = 0;
if (!adev->vm_manager.enabled)
return -ENOTTY;
if (args->va_address < AMDGPU_VA_RESERVED_SIZE) {
dev_err(&dev->pdev->dev,
"va_address 0x%lX is in reserved area 0x%X\n",
......@@ -621,6 +594,11 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
args->operation);
return -EINVAL;
}
if ((args->operation == AMDGPU_VA_OP_MAP) ||
(args->operation == AMDGPU_VA_OP_REPLACE)) {
if (amdgpu_kms_vram_lost(adev, fpriv))
return -ENODEV;
}
INIT_LIST_HEAD(&list);
if ((args->operation != AMDGPU_VA_OP_CLEAR) &&
......
......@@ -108,3 +108,209 @@ void amdgpu_gfx_parse_disable_cu(unsigned *mask, unsigned max_se, unsigned max_s
p = next + 1;
}
}
void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
{
int i, queue, pipe, mec;
/* policy for amdgpu compute queue ownership */
for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) {
queue = i % adev->gfx.mec.num_queue_per_pipe;
pipe = (i / adev->gfx.mec.num_queue_per_pipe)
% adev->gfx.mec.num_pipe_per_mec;
mec = (i / adev->gfx.mec.num_queue_per_pipe)
/ adev->gfx.mec.num_pipe_per_mec;
/* we've run out of HW */
if (mec >= adev->gfx.mec.num_mec)
break;
if (adev->gfx.mec.num_mec > 1) {
/* policy: amdgpu owns the first two queues of the first MEC */
if (mec == 0 && queue < 2)
set_bit(i, adev->gfx.mec.queue_bitmap);
} else {
/* policy: amdgpu owns all queues in the first pipe */
if (mec == 0 && pipe == 0)
set_bit(i, adev->gfx.mec.queue_bitmap);
}
}
/* update the number of active compute rings */
adev->gfx.num_compute_rings =
bitmap_weight(adev->gfx.mec.queue_bitmap, AMDGPU_MAX_COMPUTE_QUEUES);
/* If you hit this case and edited the policy, you probably just
* need to increase AMDGPU_MAX_COMPUTE_RINGS */
if (WARN_ON(adev->gfx.num_compute_rings > AMDGPU_MAX_COMPUTE_RINGS))
adev->gfx.num_compute_rings = AMDGPU_MAX_COMPUTE_RINGS;
}
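The ownership policy above maps a flat queue index onto a (MEC, pipe, queue) triple and then decides per triple whether amdgpu claims the queue. A standalone sketch of that mapping and policy follows; `struct mec_topology`, `bit_to_queue` and `amdgpu_owns_queue` are hypothetical names whose fields mirror `adev->gfx.mec`, and the sketch assumes the same two policies as the function above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology description; fields mirror adev->gfx.mec. */
struct mec_topology {
	int num_mec;
	int num_pipe_per_mec;
	int num_queue_per_pipe;
};

/* Split a flat queue index into its MEC, pipe and queue coordinates. */
static void bit_to_queue(const struct mec_topology *t, int bit,
			 int *mec, int *pipe, int *queue)
{
	*queue = bit % t->num_queue_per_pipe;
	*pipe = (bit / t->num_queue_per_pipe) % t->num_pipe_per_mec;
	*mec = (bit / t->num_queue_per_pipe) / t->num_pipe_per_mec;
}

/* Apply the ownership policy to one flat queue index. */
static bool amdgpu_owns_queue(const struct mec_topology *t, int bit)
{
	int mec, pipe, queue;

	bit_to_queue(t, bit, &mec, &pipe, &queue);

	/* Index lies beyond the available hardware. */
	if (mec >= t->num_mec)
		return false;

	/* Multiple MECs: the first two queues of every pipe on MEC 0. */
	if (t->num_mec > 1)
		return mec == 0 && queue < 2;

	/* Single MEC: every queue of pipe 0. */
	return mec == 0 && pipe == 0;
}
```

With 2 MECs, 4 pipes per MEC and 8 queues per pipe, this policy claims 8 queues in total, two on each pipe of the first MEC.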
static int amdgpu_gfx_kiq_acquire(struct amdgpu_device *adev,
struct amdgpu_ring *ring)
{
int queue_bit;
int mec, pipe, queue;
queue_bit = adev->gfx.mec.num_mec
* adev->gfx.mec.num_pipe_per_mec
* adev->gfx.mec.num_queue_per_pipe;
while (--queue_bit >= 0) {
if (test_bit(queue_bit, adev->gfx.mec.queue_bitmap))
continue;
amdgpu_gfx_bit_to_queue(adev, queue_bit, &mec, &pipe, &queue);
/* Using pipes 2/3 from MEC 2 seems to cause problems */
if (mec == 1 && pipe > 1)
continue;
ring->me = mec + 1;
ring->pipe = pipe;
ring->queue = queue;
return 0;
}
dev_err(adev->dev, "Failed to find a queue for KIQ\n");
return -EINVAL;
}
int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
struct amdgpu_ring *ring,
struct amdgpu_irq_src *irq)
{
struct amdgpu_kiq *kiq = &adev->gfx.kiq;
int r = 0;
mutex_init(&kiq->ring_mutex);
r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs);
if (r)
return r;
ring->adev = NULL;
ring->ring_obj = NULL;
ring->use_doorbell = true;
ring->doorbell_index = AMDGPU_DOORBELL_KIQ;
r = amdgpu_gfx_kiq_acquire(adev, ring);
if (r)
return r;
ring->eop_gpu_addr = kiq->eop_gpu_addr;
sprintf(ring->name, "kiq_%d.%d.%d", ring->me, ring->pipe, ring->queue);
r = amdgpu_ring_init(adev, ring, 1024,
irq, AMDGPU_CP_KIQ_IRQ_DRIVER0);
if (r)
dev_warn(adev->dev, "(%d) failed to init kiq ring\n", r);
return r;
}
void amdgpu_gfx_kiq_free_ring(struct amdgpu_ring *ring,
struct amdgpu_irq_src *irq)
{
amdgpu_wb_free(ring->adev, ring->adev->virt.reg_val_offs);
amdgpu_ring_fini(ring);
}
void amdgpu_gfx_kiq_fini(struct amdgpu_device *adev)
{
struct amdgpu_kiq *kiq = &adev->gfx.kiq;
amdgpu_bo_free_kernel(&kiq->eop_obj, &kiq->eop_gpu_addr, NULL);
}
int amdgpu_gfx_kiq_init(struct amdgpu_device *adev,
unsigned hpd_size)
{
int r;
u32 *hpd;
struct amdgpu_kiq *kiq = &adev->gfx.kiq;
r = amdgpu_bo_create_kernel(adev, hpd_size, PAGE_SIZE,
AMDGPU_GEM_DOMAIN_GTT, &kiq->eop_obj,
&kiq->eop_gpu_addr, (void **)&hpd);
if (r) {
dev_warn(adev->dev, "failed to create KIQ bo (%d).\n", r);
return r;
}
memset(hpd, 0, hpd_size);
r = amdgpu_bo_reserve(kiq->eop_obj, true);
if (unlikely(r != 0))
dev_warn(adev->dev, "(%d) reserve kiq eop bo failed\n", r);
amdgpu_bo_kunmap(kiq->eop_obj);
amdgpu_bo_unreserve(kiq->eop_obj);
return 0;
}
/* create MQD for each compute queue */
int amdgpu_gfx_compute_mqd_sw_init(struct amdgpu_device *adev,
unsigned mqd_size)
{
struct amdgpu_ring *ring = NULL;
int r, i;
/* create MQD for KIQ */
ring = &adev->gfx.kiq.ring;
if (!ring->mqd_obj) {
r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
&ring->mqd_gpu_addr, &ring->mqd_ptr);
if (r) {
dev_warn(adev->dev, "failed to create ring mqd object (%d)\n", r);
return r;
}
/* prepare MQD backup */
adev->gfx.mec.mqd_backup[AMDGPU_MAX_COMPUTE_RINGS] = kmalloc(mqd_size, GFP_KERNEL);
if (!adev->gfx.mec.mqd_backup[AMDGPU_MAX_COMPUTE_RINGS])
dev_warn(adev->dev, "no memory to create MQD backup for ring %s\n", ring->name);
}
/* create MQD for each KCQ */
for (i = 0; i < adev->gfx.num_compute_rings; i++) {
ring = &adev->gfx.compute_ring[i];
if (!ring->mqd_obj) {
r = amdgpu_bo_create_kernel(adev, mqd_size, PAGE_SIZE,
AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
&ring->mqd_gpu_addr, &ring->mqd_ptr);
if (r) {
dev_warn(adev->dev, "failed to create ring mqd object (%d)\n", r);
return r;
}
/* prepare MQD backup */
adev->gfx.mec.mqd_backup[i] = kmalloc(mqd_size, GFP_KERNEL);
if (!adev->gfx.mec.mqd_backup[i])
dev_warn(adev->dev, "no memory to create MQD backup for ring %s\n", ring->name);
}
}
return 0;
}
void amdgpu_gfx_compute_mqd_sw_fini(struct amdgpu_device *adev)
{
struct amdgpu_ring *ring = NULL;
int i;
for (i = 0; i < adev->gfx.num_compute_rings; i++) {
ring = &adev->gfx.compute_ring[i];
kfree(adev->gfx.mec.mqd_backup[i]);
amdgpu_bo_free_kernel(&ring->mqd_obj,
&ring->mqd_gpu_addr,
&ring->mqd_ptr);
}
ring = &adev->gfx.kiq.ring;
kfree(adev->gfx.mec.mqd_backup[AMDGPU_MAX_COMPUTE_RINGS]);
amdgpu_bo_free_kernel(&ring->mqd_obj,
&ring->mqd_gpu_addr,
&ring->mqd_ptr);
}
......@@ -30,4 +30,64 @@ void amdgpu_gfx_scratch_free(struct amdgpu_device *adev, uint32_t reg);
void amdgpu_gfx_parse_disable_cu(unsigned *mask, unsigned max_se,
unsigned max_sh);
void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev);
int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev,
struct amdgpu_ring *ring,
struct amdgpu_irq_src *irq);
void amdgpu_gfx_kiq_free_ring(struct amdgpu_ring *ring,
struct amdgpu_irq_src *irq);
void amdgpu_gfx_kiq_fini(struct amdgpu_device *adev);
int amdgpu_gfx_kiq_init(struct amdgpu_device *adev,
unsigned hpd_size);
int amdgpu_gfx_compute_mqd_sw_init(struct amdgpu_device *adev,
unsigned mqd_size);
void amdgpu_gfx_compute_mqd_sw_fini(struct amdgpu_device *adev);
/**
* amdgpu_gfx_create_bitmask - create a bitmask
*
* @bit_width: length of the mask
*
* create a variable length bit mask.
* Returns the bitmask.
*/
static inline u32 amdgpu_gfx_create_bitmask(u32 bit_width)
{
return (u32)((1ULL << bit_width) - 1);
}
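Illustrative note (a sketch, not part of the commit): the shift above is done on a 64-bit constant so that a width of 32 is still well defined; a 32-bit `1U << 32` would be undefined behaviour in C. The standalone copy below uses `uint32_t` in place of the kernel's `u32`; `create_bitmask` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Standalone copy of the helper for experimentation.
 * Valid for bit_width in 0..63; the result is truncated to 32 bits. */
static inline uint32_t create_bitmask(uint32_t bit_width)
{
	return (uint32_t)((1ULL << bit_width) - 1);
}
```

A width of 4 gives 0xF, and a width of 32 gives 0xFFFFFFFF.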
static inline int amdgpu_gfx_queue_to_bit(struct amdgpu_device *adev,
int mec, int pipe, int queue)
{
int bit = 0;
bit += mec * adev->gfx.mec.num_pipe_per_mec
* adev->gfx.mec.num_queue_per_pipe;
bit += pipe * adev->gfx.mec.num_queue_per_pipe;
bit += queue;
return bit;
}
static inline void amdgpu_gfx_bit_to_queue(struct amdgpu_device *adev, int bit,
int *mec, int *pipe, int *queue)
{
*queue = bit % adev->gfx.mec.num_queue_per_pipe;
*pipe = (bit / adev->gfx.mec.num_queue_per_pipe)
% adev->gfx.mec.num_pipe_per_mec;
*mec = (bit / adev->gfx.mec.num_queue_per_pipe)
/ adev->gfx.mec.num_pipe_per_mec;
}
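Illustrative note (a sketch, not part of the commit): the two helpers above are inverses of each other, treating (mec, pipe, queue) as digits of a mixed-radix number. The user-space sketch below assumes a hypothetical `struct mec_topology` in place of `adev->gfx.mec` and checks the round trip.

```c
#include <assert.h>

/* Hypothetical stand-in for the two fields of adev->gfx.mec used here. */
struct mec_topology {
	int num_pipe_per_mec;
	int num_queue_per_pipe;
};

static int queue_to_bit(const struct mec_topology *t,
			int mec, int pipe, int queue)
{
	return mec * t->num_pipe_per_mec * t->num_queue_per_pipe
	     + pipe * t->num_queue_per_pipe
	     + queue;
}

static void bit_to_queue(const struct mec_topology *t, int bit,
			 int *mec, int *pipe, int *queue)
{
	*queue = bit % t->num_queue_per_pipe;
	*pipe = (bit / t->num_queue_per_pipe) % t->num_pipe_per_mec;
	*mec = (bit / t->num_queue_per_pipe) / t->num_pipe_per_mec;
}

/* Returns 1 if every bit index below num_bits survives a round trip. */
static int round_trip_ok(const struct mec_topology *t, int num_bits)
{
	int bit, mec, pipe, queue;

	for (bit = 0; bit < num_bits; bit++) {
		bit_to_queue(t, bit, &mec, &pipe, &queue);
		if (queue_to_bit(t, mec, pipe, queue) != bit)
			return 0;
	}
	return 1;
}
```

With 4 pipes per MEC and 8 queues per pipe, (mec 1, pipe 2, queue 3) maps to bit 1*32 + 2*8 + 3 = 51.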
static inline bool amdgpu_gfx_is_mec_queue_enabled(struct amdgpu_device *adev,
int mec, int pipe, int queue)
{
return test_bit(amdgpu_gfx_queue_to_bit(adev, mec, pipe, queue),
adev->gfx.mec.queue_bitmap);
}
#endif
......@@ -121,6 +121,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
{
struct amdgpu_device *adev = ring->adev;
struct amdgpu_ib *ib = &ibs[0];
struct dma_fence *tmp = NULL;
bool skip_preamble, need_ctx_switch;
unsigned patch_offset = ~0;
struct amdgpu_vm *vm;
......@@ -160,8 +161,16 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
dev_err(adev->dev, "scheduling IB failed (%d).\n", r);
return r;
}
if (ring->funcs->emit_pipeline_sync && job && job->need_pipeline_sync)
if (ring->funcs->emit_pipeline_sync && job &&
((tmp = amdgpu_sync_get_fence(&job->sched_sync)) ||
amdgpu_vm_need_pipeline_sync(ring, job))) {
amdgpu_ring_emit_pipeline_sync(ring);
dma_fence_put(tmp);
}
if (ring->funcs->insert_start)
ring->funcs->insert_start(ring);
if (vm) {
r = amdgpu_vm_flush(ring, job);
......@@ -188,8 +197,6 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
status |= AMDGPU_HAVE_CTX_SWITCH;
status |= job->preamble_status;
if (vm)
status |= AMDGPU_VM_DOMAIN;
amdgpu_ring_emit_cntxcntl(ring, status);
}
......@@ -208,6 +215,9 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
need_ctx_switch = false;
}
if (ring->funcs->emit_tmz)
amdgpu_ring_emit_tmz(ring, false);
if (ring->funcs->emit_hdp_invalidate
#ifdef CONFIG_X86_64
&& !(adev->flags & AMD_IS_APU)
......
......@@ -62,8 +62,9 @@ enum amdgpu_ih_clientid
AMDGPU_IH_CLIENTID_MP0 = 0x1e,
AMDGPU_IH_CLIENTID_MP1 = 0x1f,
AMDGPU_IH_CLIENTID_MAX
AMDGPU_IH_CLIENTID_MAX,
AMDGPU_IH_CLIENTID_VCN = AMDGPU_IH_CLIENTID_UVD
};
#define AMDGPU_IH_CLIENTID_LEGACY 0
......
......@@ -83,7 +83,8 @@ static void amdgpu_irq_reset_work_func(struct work_struct *work)
struct amdgpu_device *adev = container_of(work, struct amdgpu_device,
reset_work);
amdgpu_gpu_reset(adev);
if (!amdgpu_sriov_vf(adev))
amdgpu_gpu_reset(adev);
}
/* Disable *all* interrupts */
......
......@@ -36,7 +36,11 @@ static void amdgpu_job_timedout(struct amd_sched_job *s_job)
job->base.sched->name,
atomic_read(&job->ring->fence_drv.last_seq),
job->ring->fence_drv.sync_seq);
amdgpu_gpu_reset(job->adev);
if (amdgpu_sriov_vf(job->adev))
amdgpu_sriov_gpu_reset(job->adev, job);
else
amdgpu_gpu_reset(job->adev);
}
int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
......@@ -57,9 +61,10 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, unsigned num_ibs,
(*job)->vm = vm;
(*job)->ibs = (void *)&(*job)[1];
(*job)->num_ibs = num_ibs;
(*job)->need_pipeline_sync = false;
amdgpu_sync_create(&(*job)->sync);
amdgpu_sync_create(&(*job)->dep_sync);
amdgpu_sync_create(&(*job)->sched_sync);
return 0;
}
......@@ -98,6 +103,8 @@ static void amdgpu_job_free_cb(struct amd_sched_job *s_job)
dma_fence_put(job->fence);
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->dep_sync);
amdgpu_sync_free(&job->sched_sync);
kfree(job);
}
......@@ -107,6 +114,8 @@ void amdgpu_job_free(struct amdgpu_job *job)
dma_fence_put(job->fence);
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->dep_sync);
amdgpu_sync_free(&job->sched_sync);
kfree(job);
}
......@@ -138,11 +147,18 @@ static struct dma_fence *amdgpu_job_dependency(struct amd_sched_job *sched_job)
struct amdgpu_job *job = to_amdgpu_job(sched_job);
struct amdgpu_vm *vm = job->vm;
struct dma_fence *fence = amdgpu_sync_get_fence(&job->sync);
struct dma_fence *fence = amdgpu_sync_get_fence(&job->dep_sync);
int r;
if (amd_sched_dependency_optimized(fence, sched_job->s_entity)) {
r = amdgpu_sync_fence(job->adev, &job->sched_sync, fence);
if (r)
DRM_ERROR("Error adding fence to sync (%d)\n", r);
}
if (!fence)
fence = amdgpu_sync_get_fence(&job->sync);
while (fence == NULL && vm && !job->vm_id) {
struct amdgpu_ring *ring = job->ring;
int r;
r = amdgpu_vm_grab_id(vm, ring, &job->sync,
&job->base.s_fence->finished,
......@@ -153,9 +169,6 @@ static struct dma_fence *amdgpu_job_dependency(struct amd_sched_job *sched_job)
fence = amdgpu_sync_get_fence(&job->sync);
}
if (amd_sched_dependency_optimized(fence, sched_job->s_entity))
job->need_pipeline_sync = true;
return fence;
}
......@@ -163,6 +176,7 @@ static struct dma_fence *amdgpu_job_run(struct amd_sched_job *sched_job)
{
struct dma_fence *fence = NULL;
struct amdgpu_job *job;
struct amdgpu_fpriv *fpriv = NULL;
int r;
if (!sched_job) {
......@@ -174,10 +188,16 @@ static struct dma_fence *amdgpu_job_run(struct amd_sched_job *sched_job)
BUG_ON(amdgpu_sync_peek_fence(&job->sync, NULL));
trace_amdgpu_sched_run_job(job);
r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs, job, &fence);
if (r)
DRM_ERROR("Error scheduling IBs (%d)\n", r);
if (job->vm)
fpriv = container_of(job->vm, struct amdgpu_fpriv, vm);
/* skip ib schedule when vram is lost */
if (fpriv && amdgpu_kms_vram_lost(job->adev, fpriv))
DRM_ERROR("Skip scheduling IBs!\n");
else {
r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs, job, &fence);
if (r)
DRM_ERROR("Error scheduling IBs (%d)\n", r);
}
/* if gpu reset, hw fence will be replaced here */
dma_fence_put(job->fence);
job->fence = dma_fence_get(fence);
......
......@@ -87,6 +87,41 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
struct amdgpu_device *adev;
int r, acpi_status;
#ifdef CONFIG_DRM_AMDGPU_SI
if (!amdgpu_si_support) {
switch (flags & AMD_ASIC_MASK) {
case CHIP_TAHITI:
case CHIP_PITCAIRN:
case CHIP_VERDE:
case CHIP_OLAND:
case CHIP_HAINAN:
dev_info(dev->dev,
"SI support provided by radeon.\n");
dev_info(dev->dev,
"Use radeon.si_support=0 amdgpu.si_support=1 to override.\n"
);
return -ENODEV;
}
}
#endif
#ifdef CONFIG_DRM_AMDGPU_CIK
if (!amdgpu_cik_support) {
switch (flags & AMD_ASIC_MASK) {
case CHIP_KAVERI:
case CHIP_BONAIRE:
case CHIP_HAWAII:
case CHIP_KABINI:
case CHIP_MULLINS:
dev_info(dev->dev,
"CIK support provided by radeon.\n");
dev_info(dev->dev,
"Use radeon.cik_support=0 amdgpu.cik_support=1 to override.\n"
);
return -ENODEV;
}
}
#endif
adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL);
if (adev == NULL) {
return -ENOMEM;
......@@ -235,6 +270,7 @@ static int amdgpu_firmware_info(struct drm_amdgpu_info_firmware *fw_info,
static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
{
struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_fpriv *fpriv = filp->driver_priv;
struct drm_amdgpu_info *info = data;
struct amdgpu_mode_info *minfo = &adev->mode_info;
void __user *out = (void __user *)(uintptr_t)info->return_pointer;
......@@ -247,6 +283,8 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
if (!info->return_size || !info->return_pointer)
return -EINVAL;
if (amdgpu_kms_vram_lost(adev, fpriv))
return -ENODEV;
switch (info->query) {
case AMDGPU_INFO_ACCEL_WORKING:
......@@ -319,6 +357,19 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
ib_start_alignment = AMDGPU_GPU_PAGE_SIZE;
ib_size_alignment = 1;
break;
case AMDGPU_HW_IP_VCN_DEC:
type = AMD_IP_BLOCK_TYPE_VCN;
ring_mask = adev->vcn.ring_dec.ready ? 1 : 0;
ib_start_alignment = AMDGPU_GPU_PAGE_SIZE;
ib_size_alignment = 16;
break;
case AMDGPU_HW_IP_VCN_ENC:
type = AMD_IP_BLOCK_TYPE_VCN;
for (i = 0; i < adev->vcn.num_enc_rings; i++)
ring_mask |= ((adev->vcn.ring_enc[i].ready ? 1 : 0) << i);
ib_start_alignment = AMDGPU_GPU_PAGE_SIZE;
ib_size_alignment = 1;
break;
default:
return -EINVAL;
}
......@@ -361,6 +412,10 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
case AMDGPU_HW_IP_UVD_ENC:
type = AMD_IP_BLOCK_TYPE_UVD;
break;
case AMDGPU_HW_IP_VCN_DEC:
case AMDGPU_HW_IP_VCN_ENC:
type = AMD_IP_BLOCK_TYPE_VCN;
break;
default:
return -EINVAL;
}
......@@ -397,6 +452,9 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
case AMDGPU_INFO_NUM_EVICTIONS:
ui64 = atomic64_read(&adev->num_evictions);
return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
case AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS:
ui64 = atomic64_read(&adev->num_vram_cpu_page_faults);
return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
case AMDGPU_INFO_VRAM_USAGE:
ui64 = atomic64_read(&adev->vram_usage);
return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
......@@ -730,6 +788,12 @@ void amdgpu_driver_lastclose_kms(struct drm_device *dev)
vga_switcheroo_process_delayed_switch();
}
bool amdgpu_kms_vram_lost(struct amdgpu_device *adev,
struct amdgpu_fpriv *fpriv)
{
return fpriv->vram_lost_counter != atomic_read(&adev->vram_lost_counter);
}
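Illustrative note (a sketch, not part of the commit): the check above compares a per-file snapshot of the device counter, taken at open time, with the current value. The user-space sketch below uses C11 atomics; all names are hypothetical, and the idea that a reset increments the counter is an assumption, since this part of the diff does not show where the counter changes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for amdgpu_device and amdgpu_fpriv. */
struct device_state { atomic_int vram_lost_counter; };
struct client_state { int vram_lost_counter; };

/* Take the snapshot when the client opens the device. */
static void client_open(struct client_state *c, struct device_state *d)
{
	c->vram_lost_counter = atomic_load(&d->vram_lost_counter);
}

/* True if the device counter moved since the client's snapshot. */
static bool client_vram_lost(struct client_state *c, struct device_state *d)
{
	return c->vram_lost_counter != atomic_load(&d->vram_lost_counter);
}

/* Assumption: a reset that loses VRAM contents bumps the counter. */
static void device_reset_lost_vram(struct device_state *d)
{
	atomic_fetch_add(&d->vram_lost_counter, 1);
}
```

A client opened before the reset sees its VRAM as lost, while one opened afterwards does not, which matches how the info ioctl and the job run path use the check.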
/**
* amdgpu_driver_open_kms - drm callback for open
*
......@@ -757,7 +821,8 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
goto out_suspend;
}
r = amdgpu_vm_init(adev, &fpriv->vm);
r = amdgpu_vm_init(adev, &fpriv->vm,
AMDGPU_VM_CONTEXT_GFX);
if (r) {
kfree(fpriv);
goto out_suspend;
......@@ -782,6 +847,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
amdgpu_ctx_mgr_init(&fpriv->ctx_mgr);
fpriv->vram_lost_counter = atomic_read(&adev->vram_lost_counter);
file_priv->driver_priv = fpriv;
out_suspend:
......@@ -814,8 +880,10 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
amdgpu_ctx_mgr_fini(&fpriv->ctx_mgr);
amdgpu_uvd_free_handles(adev, file_priv);
amdgpu_vce_free_handles(adev, file_priv);
if (adev->asic_type != CHIP_RAVEN) {
amdgpu_uvd_free_handles(adev, file_priv);
amdgpu_vce_free_handles(adev, file_priv);
}
amdgpu_vm_bo_rmv(adev, fpriv->prt_va);
......@@ -945,50 +1013,10 @@ void amdgpu_disable_vblank_kms(struct drm_device *dev, unsigned int pipe)
amdgpu_irq_put(adev, &adev->crtc_irq, idx);
}
/**
* amdgpu_get_vblank_timestamp_kms - get vblank timestamp
*
* @dev: drm dev pointer
* @crtc: crtc to get the timestamp for
* @max_error: max error
* @vblank_time: time value
* @flags: flags passed to the driver
*
* Gets the timestamp on the requested crtc based on the
* scanout position. (all asics).
* Returns positive status flags on success, negative error on failure.
*/
int amdgpu_get_vblank_timestamp_kms(struct drm_device *dev, unsigned int pipe,
int *max_error,
struct timeval *vblank_time,
unsigned flags)
{
struct drm_crtc *crtc;
struct amdgpu_device *adev = dev->dev_private;
if (pipe >= dev->num_crtcs) {
DRM_ERROR("Invalid crtc %u\n", pipe);
return -EINVAL;
}
/* Get associated drm_crtc: */
crtc = &adev->mode_info.crtcs[pipe]->base;
if (!crtc) {
/* This can occur on driver load if some component fails to
* initialize completely and driver is unloaded */
DRM_ERROR("Uninitialized crtc %d\n", pipe);
return -EINVAL;
}
/* Helper routine in DRM core does all the work: */
return drm_calc_vbltimestamp_from_scanoutpos(dev, pipe, max_error,
vblank_time, flags,
&crtc->hwmode);
}
const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_VM, amdgpu_vm_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_BO_LIST, amdgpu_bo_list_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
/* KMS */
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_MMAP, amdgpu_gem_mmap_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
......
......@@ -534,6 +534,9 @@ struct amdgpu_framebuffer {
((em) == ATOM_ENCODER_MODE_DP_MST))
/* Driver internal use only flags of amdgpu_get_crtc_scanoutpos() */
#define DRM_SCANOUTPOS_VALID (1 << 0)
#define DRM_SCANOUTPOS_IN_VBLANK (1 << 1)
#define DRM_SCANOUTPOS_ACCURATE (1 << 2)
#define USE_REAL_VBLANKSTART (1 << 30)
#define GET_DISTANCE_TO_VBLANKSTART (1 << 31)
......
......@@ -960,6 +960,7 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
return -EINVAL;
/* hurrah the memory is not visible ! */
atomic64_inc(&adev->num_vram_cpu_page_faults);
amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
lpfn = adev->mc.visible_vram_size >> PAGE_SHIFT;
for (i = 0; i < abo->placement.num_placement; i++) {
......
......@@ -72,6 +72,7 @@ static int amdgpu_pp_early_init(void *handle)
case CHIP_CARRIZO:
case CHIP_STONEY:
case CHIP_VEGA10:
case CHIP_RAVEN:
adev->pp_enabled = true;
if (amdgpu_create_pp_handle(adev))
return -EINVAL;
......
......@@ -24,12 +24,13 @@
*/
#include <linux/firmware.h>
#include "drmP.h"
#include <drm/drmP.h>
#include "amdgpu.h"
#include "amdgpu_psp.h"
#include "amdgpu_ucode.h"
#include "soc15_common.h"
#include "psp_v3_1.h"
#include "psp_v10_0.h"
static void psp_set_funcs(struct amdgpu_device *adev);
......@@ -61,6 +62,12 @@ static int psp_sw_init(void *handle)
psp->compare_sram_data = psp_v3_1_compare_sram_data;
psp->smu_reload_quirk = psp_v3_1_smu_reload_quirk;
break;
case CHIP_RAVEN:
psp->prep_cmd_buf = psp_v10_0_prep_cmd_buf;
psp->ring_init = psp_v10_0_ring_init;
psp->cmd_submit = psp_v10_0_cmd_submit;
psp->compare_sram_data = psp_v10_0_compare_sram_data;
break;
default:
return -EINVAL;
}
......@@ -230,6 +237,13 @@ static int psp_asd_load(struct psp_context *psp)
int ret;
struct psp_gfx_cmd_resp *cmd;
/* If the PSP version doesn't match the ASD version, ASD loading will fail.
* Add a workaround that skips ASD loading under SR-IOV for now.
* TODO: add a version check to make this common
*/
if (amdgpu_sriov_vf(psp->adev))
return 0;
cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
if (!cmd)
return -ENOMEM;
......@@ -542,3 +556,12 @@ const struct amdgpu_ip_block_version psp_v3_1_ip_block =
.rev = 0,
.funcs = &psp_ip_funcs,
};
const struct amdgpu_ip_block_version psp_v10_0_ip_block =
{
.type = AMD_IP_BLOCK_TYPE_PSP,
.major = 10,
.minor = 0,
.rev = 0,
.funcs = &psp_ip_funcs,
};
......@@ -138,4 +138,6 @@ extern const struct amdgpu_ip_block_version psp_v3_1_ip_block;
extern int psp_wait_for(struct psp_context *psp, uint32_t reg_index,
uint32_t field_val, uint32_t mask, bool check_changed);
extern const struct amdgpu_ip_block_version psp_v10_0_ip_block;
#endif
/*
* Copyright 2017 Valve Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*
* Authors: Andres Rodriguez
*/
#include "amdgpu.h"
#include "amdgpu_ring.h"
static int amdgpu_queue_mapper_init(struct amdgpu_queue_mapper *mapper,
int hw_ip)
{
if (!mapper)
return -EINVAL;
if (hw_ip >= AMDGPU_MAX_IP_NUM)
return -EINVAL;
mapper->hw_ip = hw_ip;
mutex_init(&mapper->lock);
memset(mapper->queue_map, 0, sizeof(mapper->queue_map));
return 0;
}
static struct amdgpu_ring *amdgpu_get_cached_map(struct amdgpu_queue_mapper *mapper,
int ring)
{
return mapper->queue_map[ring];
}
static int amdgpu_update_cached_map(struct amdgpu_queue_mapper *mapper,
int ring, struct amdgpu_ring *pring)
{
if (WARN_ON(mapper->queue_map[ring])) {
DRM_ERROR("Unexpected ring re-map\n");
return -EINVAL;
}
mapper->queue_map[ring] = pring;
return 0;
}
static int amdgpu_identity_map(struct amdgpu_device *adev,
struct amdgpu_queue_mapper *mapper,
int ring,
struct amdgpu_ring **out_ring)
{
switch (mapper->hw_ip) {
case AMDGPU_HW_IP_GFX:
*out_ring = &adev->gfx.gfx_ring[ring];
break;
case AMDGPU_HW_IP_COMPUTE:
*out_ring = &adev->gfx.compute_ring[ring];
break;
case AMDGPU_HW_IP_DMA:
*out_ring = &adev->sdma.instance[ring].ring;
break;
case AMDGPU_HW_IP_UVD:
*out_ring = &adev->uvd.ring;
break;
case AMDGPU_HW_IP_VCE:
*out_ring = &adev->vce.ring[ring];
break;
case AMDGPU_HW_IP_UVD_ENC:
*out_ring = &adev->uvd.ring_enc[ring];
break;
case AMDGPU_HW_IP_VCN_DEC:
*out_ring = &adev->vcn.ring_dec;
break;
case AMDGPU_HW_IP_VCN_ENC:
*out_ring = &adev->vcn.ring_enc[ring];
break;
default:
*out_ring = NULL;
DRM_ERROR("unknown HW IP type: %d\n", mapper->hw_ip);
return -EINVAL;
}
return amdgpu_update_cached_map(mapper, ring, *out_ring);
}
static enum amdgpu_ring_type amdgpu_hw_ip_to_ring_type(int hw_ip)
{
switch (hw_ip) {
case AMDGPU_HW_IP_GFX:
return AMDGPU_RING_TYPE_GFX;
case AMDGPU_HW_IP_COMPUTE:
return AMDGPU_RING_TYPE_COMPUTE;
case AMDGPU_HW_IP_DMA:
return AMDGPU_RING_TYPE_SDMA;
case AMDGPU_HW_IP_UVD:
return AMDGPU_RING_TYPE_UVD;
case AMDGPU_HW_IP_VCE:
return AMDGPU_RING_TYPE_VCE;
default:
DRM_ERROR("Invalid HW IP specified %d\n", hw_ip);
return -1;
}
}
static int amdgpu_lru_map(struct amdgpu_device *adev,
struct amdgpu_queue_mapper *mapper,
int user_ring,
struct amdgpu_ring **out_ring)
{
int r, i, j;
int ring_type = amdgpu_hw_ip_to_ring_type(mapper->hw_ip);
int ring_blacklist[AMDGPU_MAX_RINGS];
struct amdgpu_ring *ring;
/* 0 is a valid ring index, so initialize to -1 */
memset(ring_blacklist, 0xff, sizeof(ring_blacklist));
for (i = 0, j = 0; i < AMDGPU_MAX_RINGS; i++) {
ring = mapper->queue_map[i];
if (ring)
ring_blacklist[j++] = ring->idx;
}
r = amdgpu_ring_lru_get(adev, ring_type, ring_blacklist,
j, out_ring);
if (r)
return r;
return amdgpu_update_cached_map(mapper, user_ring, *out_ring);
}
/**
* amdgpu_queue_mgr_init - init an amdgpu_queue_mgr struct
*
* @adev: amdgpu_device pointer
* @mgr: amdgpu_queue_mgr structure holding queue information
*
* Initialize the selected @mgr (all asics).
*
* Returns 0 on success, error on failure.
*/
int amdgpu_queue_mgr_init(struct amdgpu_device *adev,
struct amdgpu_queue_mgr *mgr)
{
int i, r;
if (!adev || !mgr)
return -EINVAL;
memset(mgr, 0, sizeof(*mgr));
for (i = 0; i < AMDGPU_MAX_IP_NUM; ++i) {
r = amdgpu_queue_mapper_init(&mgr->mapper[i], i);
if (r)
return r;
}
return 0;
}
/**
* amdgpu_queue_mgr_fini - de-initialize an amdgpu_queue_mgr struct
*
* @adev: amdgpu_device pointer
* @mgr: amdgpu_queue_mgr structure holding queue information
*
* De-initialize the selected @mgr (all asics).
*
* Returns 0 on success, error on failure.
*/
int amdgpu_queue_mgr_fini(struct amdgpu_device *adev,
struct amdgpu_queue_mgr *mgr)
{
return 0;
}
/**
* amdgpu_queue_mgr_map - Map a userspace ring id to an amdgpu_ring
*
* @adev: amdgpu_device pointer
* @mgr: amdgpu_queue_mgr structure holding queue information
* @hw_ip: HW IP enum
* @instance: HW instance
* @ring: user ring id
* @out_ring: pointer to mapped amdgpu_ring
*
* Map a userspace ring id to an appropriate kernel ring. Different
* policies are configurable at a HW IP level.
*
* Returns 0 on success, error on failure.
*/
int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
struct amdgpu_queue_mgr *mgr,
int hw_ip, int instance, int ring,
struct amdgpu_ring **out_ring)
{
int r, ip_num_rings;
struct amdgpu_queue_mapper *mapper = &mgr->mapper[hw_ip];
if (!adev || !mgr || !out_ring)
return -EINVAL;
if (hw_ip >= AMDGPU_MAX_IP_NUM)
return -EINVAL;
if (ring >= AMDGPU_MAX_RINGS)
return -EINVAL;
/* Right now all IPs have only one instance - multiple rings. */
if (instance != 0) {
DRM_ERROR("invalid ip instance: %d\n", instance);
return -EINVAL;
}
switch (hw_ip) {
case AMDGPU_HW_IP_GFX:
ip_num_rings = adev->gfx.num_gfx_rings;
break;
case AMDGPU_HW_IP_COMPUTE:
ip_num_rings = adev->gfx.num_compute_rings;
break;
case AMDGPU_HW_IP_DMA:
ip_num_rings = adev->sdma.num_instances;
break;
case AMDGPU_HW_IP_UVD:
ip_num_rings = 1;
break;
case AMDGPU_HW_IP_VCE:
ip_num_rings = adev->vce.num_rings;
break;
case AMDGPU_HW_IP_UVD_ENC:
ip_num_rings = adev->uvd.num_enc_rings;
break;
case AMDGPU_HW_IP_VCN_DEC:
ip_num_rings = 1;
break;
case AMDGPU_HW_IP_VCN_ENC:
ip_num_rings = adev->vcn.num_enc_rings;
break;
default:
DRM_ERROR("unknown ip type: %d\n", hw_ip);
return -EINVAL;
}
if (ring >= ip_num_rings) {
DRM_ERROR("Ring index:%d exceeds maximum:%d for ip:%d\n",
ring, ip_num_rings, hw_ip);
return -EINVAL;
}
mutex_lock(&mapper->lock);
*out_ring = amdgpu_get_cached_map(mapper, ring);
if (*out_ring) {
/* cache hit */
r = 0;
goto out_unlock;
}
switch (mapper->hw_ip) {
case AMDGPU_HW_IP_GFX:
case AMDGPU_HW_IP_UVD:
case AMDGPU_HW_IP_VCE:
case AMDGPU_HW_IP_UVD_ENC:
case AMDGPU_HW_IP_VCN_DEC:
case AMDGPU_HW_IP_VCN_ENC:
r = amdgpu_identity_map(adev, mapper, ring, out_ring);
break;
case AMDGPU_HW_IP_DMA:
case AMDGPU_HW_IP_COMPUTE:
r = amdgpu_lru_map(adev, mapper, ring, out_ring);
break;
default:
*out_ring = NULL;
r = -EINVAL;
DRM_ERROR("unknown HW IP type: %d\n", mapper->hw_ip);
}
out_unlock:
mutex_unlock(&mapper->lock);
return r;
}
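Illustrative note (a sketch, not part of the commit): the mapping function first consults a per-IP cache and only runs a policy (identity or LRU) on a miss, so a given user ring id always resolves to the same kernel ring afterwards. The user-space sketch below shows the cache plus the identity policy only; all names are hypothetical, locking is left out, and the negative-index check is an addition of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { MAX_RINGS = 4 };

struct ring { int idx; };

/* Hypothetical stand-in for amdgpu_queue_mapper without the mutex. */
struct mapper { struct ring *queue_map[MAX_RINGS]; };

/* Record a mapping; refuses to overwrite an existing one. */
static int update_cached_map(struct mapper *m, int user_ring, struct ring *r)
{
	if (m->queue_map[user_ring])
		return -EINVAL;
	m->queue_map[user_ring] = r;
	return 0;
}

/* Resolve user_ring to a hardware ring: cache first, identity on a miss. */
static int map_ring(struct mapper *m, struct ring *hw_rings, int num_hw_rings,
		    int user_ring, struct ring **out)
{
	if (user_ring < 0 || user_ring >= num_hw_rings || user_ring >= MAX_RINGS)
		return -EINVAL;

	*out = m->queue_map[user_ring];
	if (*out)
		return 0; /* cache hit */

	*out = &hw_rings[user_ring]; /* identity policy */
	return update_cached_map(m, user_ring, *out);
}
```

Swapping the identity line for a least-recently-used lookup gives the behaviour the diff applies to DMA and compute rings.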
......@@ -135,6 +135,8 @@ void amdgpu_ring_commit(struct amdgpu_ring *ring)
if (ring->funcs->end_use)
ring->funcs->end_use(ring);
amdgpu_ring_lru_touch(ring->adev, ring);
}
/**
......@@ -253,10 +255,13 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
}
ring->max_dw = max_dw;
INIT_LIST_HEAD(&ring->lru_list);
amdgpu_ring_lru_touch(adev, ring);
if (amdgpu_debugfs_ring_init(adev, ring)) {
DRM_ERROR("Failed to register debugfs file for rings !\n");
}
return 0;
}
......@@ -294,6 +299,84 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring)
ring->adev->rings[ring->idx] = NULL;
}
static void amdgpu_ring_lru_touch_locked(struct amdgpu_device *adev,
struct amdgpu_ring *ring)
{
/* list_move_tail handles the case where ring isn't part of the list */
list_move_tail(&ring->lru_list, &adev->ring_lru_list);
}
static bool amdgpu_ring_is_blacklisted(struct amdgpu_ring *ring,
int *blacklist, int num_blacklist)
{
int i;
for (i = 0; i < num_blacklist; i++) {
if (ring->idx == blacklist[i])
return true;
}
return false;
}
/**
* amdgpu_ring_lru_get - get the least recently used ring for a HW IP block
*
* @adev: amdgpu_device pointer
* @type: amdgpu_ring_type enum
* @blacklist: blacklisted ring ids array
* @num_blacklist: number of entries in @blacklist
* @ring: output ring
*
* Retrieve the amdgpu_ring structure for the least recently used ring of
* a specific IP block (all asics).
* Returns 0 on success, error on failure.
*/
int amdgpu_ring_lru_get(struct amdgpu_device *adev, int type, int *blacklist,
int num_blacklist, struct amdgpu_ring **ring)
{
struct amdgpu_ring *entry;
/* List is sorted in LRU order, find first entry corresponding
* to the desired HW IP */
*ring = NULL;
spin_lock(&adev->ring_lru_list_lock);
list_for_each_entry(entry, &adev->ring_lru_list, lru_list) {
if (entry->funcs->type != type)
continue;
if (amdgpu_ring_is_blacklisted(entry, blacklist, num_blacklist))
continue;
*ring = entry;
amdgpu_ring_lru_touch_locked(adev, *ring);
break;
}
spin_unlock(&adev->ring_lru_list_lock);
if (!*ring) {
DRM_ERROR("Ring LRU contains no entries for ring type:%d\n", type);
return -EINVAL;
}
return 0;
}
/**
* amdgpu_ring_lru_touch - mark a ring as recently being used
*
* @adev: amdgpu_device pointer
* @ring: ring to touch
*
* Move @ring to the tail of the lru list
*/
void amdgpu_ring_lru_touch(struct amdgpu_device *adev, struct amdgpu_ring *ring)
{
spin_lock(&adev->ring_lru_list_lock);
amdgpu_ring_lru_touch_locked(adev, ring);
spin_unlock(&adev->ring_lru_list_lock);
}
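Illustrative note (a sketch, not part of the commit): the LRU lookup scans the list from the least recently used end, skips rings of the wrong type and blacklisted rings, and moves the chosen ring to the tail. The user-space sketch below replaces the kernel list and spinlock with a small array and no locking; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MAX_RINGS = 8 };

struct ring { int idx; int type; };

struct lru {
	const struct ring *order[MAX_RINGS]; /* order[0] is least recently used */
	int count;
};

/* Move r to the tail, adding it first if it is not on the list yet. */
static void lru_touch(struct lru *l, const struct ring *r)
{
	int i, pos = l->count;

	for (i = 0; i < l->count; i++) {
		if (l->order[i] == r) {
			pos = i;
			break;
		}
	}
	if (pos == l->count) {
		if (l->count == MAX_RINGS)
			return; /* list full: ignore in this sketch */
		l->count++;
	}
	memmove(&l->order[pos], &l->order[pos + 1],
		(size_t)(l->count - 1 - pos) * sizeof(l->order[0]));
	l->order[l->count - 1] = r;
}

static bool is_blacklisted(int idx, const int *blacklist, int num_blacklist)
{
	int i;

	for (i = 0; i < num_blacklist; i++)
		if (blacklist[i] == idx)
			return true;
	return false;
}

/* Return the least recently used ring of the given type, or NULL. */
static const struct ring *lru_get(struct lru *l, int type,
				  const int *blacklist, int num_blacklist)
{
	int i;

	for (i = 0; i < l->count; i++) {
		const struct ring *r = l->order[i];

		if (r->type != type)
			continue;
		if (is_blacklisted(r->idx, blacklist, num_blacklist))
			continue;
		lru_touch(l, r);
		return r;
	}
	return NULL;
}
```

Because each successful lookup touches the ring it returns, repeated lookups of the same type rotate through the available rings.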
/*
* Debugfs info
*/
......
......@@ -47,7 +47,9 @@ enum amdgpu_ring_type {
AMDGPU_RING_TYPE_UVD,
AMDGPU_RING_TYPE_VCE,
AMDGPU_RING_TYPE_KIQ,
AMDGPU_RING_TYPE_UVD_ENC
AMDGPU_RING_TYPE_UVD_ENC,
AMDGPU_RING_TYPE_VCN_DEC,
AMDGPU_RING_TYPE_VCN_ENC
};
struct amdgpu_device;
......@@ -76,6 +78,7 @@ struct amdgpu_fence_driver {
int amdgpu_fence_driver_init(struct amdgpu_device *adev);
void amdgpu_fence_driver_fini(struct amdgpu_device *adev);
void amdgpu_fence_driver_force_completion(struct amdgpu_device *adev);
void amdgpu_fence_driver_force_completion_ring(struct amdgpu_ring *ring);
int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
unsigned num_hw_submission);
......@@ -130,6 +133,7 @@ struct amdgpu_ring_funcs {
int (*test_ib)(struct amdgpu_ring *ring, long timeout);
/* insert NOP packets */
void (*insert_nop)(struct amdgpu_ring *ring, uint32_t count);
void (*insert_start)(struct amdgpu_ring *ring);
void (*insert_end)(struct amdgpu_ring *ring);
/* pad the indirect buffer to the necessary number of dw */
void (*pad_ib)(struct amdgpu_ring *ring, struct amdgpu_ib *ib);
......@@ -142,6 +146,7 @@ struct amdgpu_ring_funcs {
void (*emit_cntxcntl) (struct amdgpu_ring *ring, uint32_t flags);
void (*emit_rreg)(struct amdgpu_ring *ring, uint32_t reg);
void (*emit_wreg)(struct amdgpu_ring *ring, uint32_t reg, uint32_t val);
void (*emit_tmz)(struct amdgpu_ring *ring, bool start);
};
struct amdgpu_ring {
......@@ -149,6 +154,7 @@ struct amdgpu_ring {
const struct amdgpu_ring_funcs *funcs;
struct amdgpu_fence_driver fence_drv;
struct amd_gpu_scheduler sched;
struct list_head lru_list;
struct amdgpu_bo *ring_obj;
volatile uint32_t *ring;
......@@ -180,6 +186,7 @@ struct amdgpu_ring {
u64 cond_exe_gpu_addr;
volatile u32 *cond_exe_cpu_addr;
unsigned vm_inv_eng;
bool has_compute_vm_bug;
#if defined(CONFIG_DEBUG_FS)
struct dentry *ent;
#endif
......@@ -194,6 +201,9 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
unsigned ring_size, struct amdgpu_irq_src *irq_src,
unsigned irq_type);
void amdgpu_ring_fini(struct amdgpu_ring *ring);
int amdgpu_ring_lru_get(struct amdgpu_device *adev, int type, int *blacklist,
int num_blacklist, struct amdgpu_ring **ring);
void amdgpu_ring_lru_touch(struct amdgpu_device *adev, struct amdgpu_ring *ring);
static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
{
int i = 0;
......
......@@ -298,6 +298,25 @@ struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync)
return NULL;
}
int amdgpu_sync_wait(struct amdgpu_sync *sync, bool intr)
{
struct amdgpu_sync_entry *e;
struct hlist_node *tmp;
int i, r;
hash_for_each_safe(sync->fences, i, tmp, e, node) {
r = dma_fence_wait(e->fence, intr);
if (r)
return r;
hash_del(&e->node);
dma_fence_put(e->fence);
kmem_cache_free(amdgpu_sync_slab, e);
}
return 0;
}
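Illustrative note (a sketch, not part of the commit): the wait helper releases each entry as soon as its fence has been waited on, and returns on the first error with the failing entry and everything after it still in the container. The user-space sketch below uses a singly linked list in place of the hash table and a stored result in place of a blocking wait; all names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct entry {
	int wait_result; /* stands in for the result of a blocking wait */
	struct entry *next;
};

/* Prepend a new entry; aborts on allocation failure to keep the sketch short. */
static struct entry *push(struct entry *next, int wait_result)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		abort();
	e->wait_result = wait_result;
	e->next = next;
	return e;
}

static int list_len(const struct entry *e)
{
	int n = 0;

	for (; e; e = e->next)
		n++;
	return n;
}

/* Wait on entries from the head, freeing each one that completes.
 * On the first failure, return the error and keep the remaining entries. */
static int sync_wait(struct entry **head)
{
	while (*head) {
		struct entry *e = *head;
		int r = e->wait_result;

		if (r)
			return r;
		*head = e->next;
		free(e);
	}
	return 0;
}
```

A caller that gets an error back can retry later; the entries that already completed are not waited on again.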
/**
* amdgpu_sync_free - free the sync object
*
......
......@@ -49,6 +49,7 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
struct dma_fence *amdgpu_sync_peek_fence(struct amdgpu_sync *sync,
struct amdgpu_ring *ring);
struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync);
int amdgpu_sync_wait(struct amdgpu_sync *sync, bool intr);
void amdgpu_sync_free(struct amdgpu_sync *sync);
int amdgpu_sync_init(void);
void amdgpu_sync_fini(void);
......
......@@ -29,11 +29,11 @@
* Thomas Hellstrom <thomas-at-tungstengraphics-dot-com>
* Dave Airlie
*/
#include <ttm/ttm_bo_api.h>
#include <ttm/ttm_bo_driver.h>
#include <ttm/ttm_placement.h>
#include <ttm/ttm_module.h>
#include <ttm/ttm_page_alloc.h>
#include <drm/ttm/ttm_bo_api.h>
#include <drm/ttm/ttm_bo_driver.h>
#include <drm/ttm/ttm_placement.h>
#include <drm/ttm/ttm_module.h>
#include <drm/ttm/ttm_page_alloc.h>
#include <drm/drmP.h>
#include <drm/amdgpu_drm.h>
#include <linux/seq_file.h>
......@@ -745,6 +745,7 @@ int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem)
return r;
}
spin_lock(&gtt->adev->gtt_list_lock);
flags = amdgpu_ttm_tt_pte_flags(gtt->adev, ttm, bo_mem);
gtt->offset = (u64)bo_mem->start << PAGE_SHIFT;
r = amdgpu_gart_bind(gtt->adev, gtt->offset, ttm->num_pages,
......@@ -753,12 +754,13 @@ int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem)
if (r) {
DRM_ERROR("failed to bind %lu pages at 0x%08llX\n",
ttm->num_pages, gtt->offset);
return r;
goto error_gart_bind;
}
spin_lock(&gtt->adev->gtt_list_lock);
list_add_tail(&gtt->list, &gtt->adev->gtt_list);
error_gart_bind:
spin_unlock(&gtt->adev->gtt_list_lock);
return 0;
return r;
}
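Illustrative note (a sketch, not part of the commit): the reworked bind path takes the lock before the bind and routes the failure case through a label placed just before the unlock, so both paths release the lock and return the real error code. The user-space sketch below uses a counter as a fake lock to make the balance checkable; all names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Fake lock: depth must be back at 0 whenever a function returns. */
struct fake_lock { int depth; };

static void fake_lock_acquire(struct fake_lock *l) { l->depth++; }
static void fake_lock_release(struct fake_lock *l) { l->depth--; }

/* bind_result stands in for the return value of the bind call.
 * *listed counts how many objects were added to the list. */
static int bind_and_list(struct fake_lock *l, int bind_result, int *listed)
{
	int r;

	fake_lock_acquire(l);
	r = bind_result;
	if (r)
		goto error_bind;

	(*listed)++; /* only reached when the bind succeeded */

error_bind:
	fake_lock_release(l);
	return r;
}
```

The single exit keeps the list update and the bind under one critical section, at the cost of holding the lock across the bind call.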
int amdgpu_ttm_recover_gart(struct amdgpu_device *adev)
......@@ -789,6 +791,7 @@ int amdgpu_ttm_recover_gart(struct amdgpu_device *adev)
static int amdgpu_ttm_backend_unbind(struct ttm_tt *ttm)
{
struct amdgpu_ttm_tt *gtt = (void *)ttm;
int r;
if (gtt->userptr)
amdgpu_ttm_tt_unpin_userptr(ttm);
......@@ -797,14 +800,17 @@ static int amdgpu_ttm_backend_unbind(struct ttm_tt *ttm)
return 0;
/* unbind shouldn't be done for GDS/GWS/OA in ttm_bo_clean_mm */
-if (gtt->adev->gart.ready)
-amdgpu_gart_unbind(gtt->adev, gtt->offset, ttm->num_pages);
spin_lock(&gtt->adev->gtt_list_lock);
+r = amdgpu_gart_unbind(gtt->adev, gtt->offset, ttm->num_pages);
+if (r) {
+DRM_ERROR("failed to unbind %lu pages at 0x%08llX\n",
+gtt->ttm.ttm.num_pages, gtt->offset);
+goto error_unbind;
+}
list_del_init(&gtt->list);
+error_unbind:
spin_unlock(&gtt->adev->gtt_list_lock);
-return 0;
+return r;
}
static void amdgpu_ttm_backend_destroy(struct ttm_tt *ttm)
......@@ -1115,7 +1121,7 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
/* Change the size here instead of the init above so only lpfn is affected */
amdgpu_ttm_set_active_vram_size(adev, adev->mc.visible_vram_size);
-r = amdgpu_bo_create(adev, 256 * 1024, PAGE_SIZE, true,
+r = amdgpu_bo_create(adev, adev->mc.stolen_size, PAGE_SIZE, true,
AMDGPU_GEM_DOMAIN_VRAM,
AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
......@@ -1462,6 +1468,9 @@ static ssize_t amdgpu_ttm_vram_read(struct file *f, char __user *buf,
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
if (*pos >= adev->mc.mc_vram_size)
return -ENXIO;
while (size) {
unsigned long flags;
uint32_t value;
......
......@@ -197,6 +197,27 @@ void amdgpu_ucode_print_sdma_hdr(const struct common_firmware_header *hdr)
}
}
void amdgpu_ucode_print_gpu_info_hdr(const struct common_firmware_header *hdr)
{
uint16_t version_major = le16_to_cpu(hdr->header_version_major);
uint16_t version_minor = le16_to_cpu(hdr->header_version_minor);
DRM_DEBUG("GPU_INFO\n");
amdgpu_ucode_print_common_hdr(hdr);
if (version_major == 1) {
const struct gpu_info_firmware_header_v1_0 *gpu_info_hdr =
container_of(hdr, struct gpu_info_firmware_header_v1_0, header);
DRM_DEBUG("version_major: %u\n",
le16_to_cpu(gpu_info_hdr->version_major));
DRM_DEBUG("version_minor: %u\n",
le16_to_cpu(gpu_info_hdr->version_minor));
} else {
DRM_ERROR("Unknown gpu_info ucode version: %u.%u\n", version_major, version_minor);
}
}
int amdgpu_ucode_validate(const struct firmware *fw)
{
const struct common_firmware_header *hdr =
......@@ -253,6 +274,15 @@ amdgpu_ucode_get_load_type(struct amdgpu_device *adev, int load_type)
return AMDGPU_FW_LOAD_DIRECT;
else
return AMDGPU_FW_LOAD_PSP;
case CHIP_RAVEN:
#if 0
if (!load_type)
return AMDGPU_FW_LOAD_DIRECT;
else
return AMDGPU_FW_LOAD_PSP;
#else
return AMDGPU_FW_LOAD_DIRECT;
#endif
default:
DRM_ERROR("Unknow firmware load type\n");
}
......@@ -349,7 +379,8 @@ int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
err = amdgpu_bo_create(adev, adev->firmware.fw_size, PAGE_SIZE, true,
amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
-0, NULL, NULL, bo);
+AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
+NULL, NULL, bo);
if (err) {
dev_err(adev->dev, "(%d) Firmware buffer allocate failed\n", err);
goto failed;
......
......@@ -113,6 +113,32 @@ struct sdma_firmware_header_v1_1 {
uint32_t digest_size;
};
/* gpu info payload */
struct gpu_info_firmware_v1_0 {
uint32_t gc_num_se;
uint32_t gc_num_cu_per_sh;
uint32_t gc_num_sh_per_se;
uint32_t gc_num_rb_per_se;
uint32_t gc_num_tccs;
uint32_t gc_num_gprs;
uint32_t gc_num_max_gs_thds;
uint32_t gc_gs_table_depth;
uint32_t gc_gsprim_buff_depth;
uint32_t gc_parameter_cache_depth;
uint32_t gc_double_offchip_lds_buffer;
uint32_t gc_wave_size;
uint32_t gc_max_waves_per_simd;
uint32_t gc_max_scratch_slots_per_cu;
uint32_t gc_lds_size;
};
/* version_major=1, version_minor=0 */
struct gpu_info_firmware_header_v1_0 {
struct common_firmware_header header;
uint16_t version_major; /* version */
uint16_t version_minor; /* version */
};
/* header is fixed size */
union amdgpu_firmware_header {
struct common_firmware_header common;
......@@ -124,6 +150,7 @@ union amdgpu_firmware_header {
struct rlc_firmware_header_v2_0 rlc_v2_0;
struct sdma_firmware_header_v1_0 sdma;
struct sdma_firmware_header_v1_1 sdma_v1_1;
struct gpu_info_firmware_header_v1_0 gpu_info;
uint8_t raw[0x100];
};
......@@ -184,6 +211,7 @@ void amdgpu_ucode_print_smc_hdr(const struct common_firmware_header *hdr);
void amdgpu_ucode_print_gfx_hdr(const struct common_firmware_header *hdr);
void amdgpu_ucode_print_rlc_hdr(const struct common_firmware_header *hdr);
void amdgpu_ucode_print_sdma_hdr(const struct common_firmware_header *hdr);
void amdgpu_ucode_print_gpu_info_hdr(const struct common_firmware_header *hdr);
int amdgpu_ucode_validate(const struct firmware *fw);
bool amdgpu_ucode_hdr_version(union amdgpu_firmware_header *hdr,
uint16_t hdr_major, uint16_t hdr_minor);
......
......@@ -33,6 +33,8 @@
struct amdgpu_vce {
struct amdgpu_bo *vcpu_bo;
uint64_t gpu_addr;
void *cpu_addr;
void *saved_bo;
unsigned fw_version;
unsigned fb_version;
atomic_t handles[AMDGPU_MAX_VCE_HANDLES];
......
This diff is collapsed.
......@@ -52,7 +52,6 @@ struct amdgpu_virt {
uint64_t csa_vmid0_addr;
bool chained_ib_support;
uint32_t reg_val_offs;
struct mutex lock_kiq;
struct mutex lock_reset;
struct amdgpu_irq_src ack_irq;
struct amdgpu_irq_src rcv_irq;
......@@ -97,7 +96,7 @@ void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v);
int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init);
int amdgpu_virt_release_full_gpu(struct amdgpu_device *adev, bool init);
int amdgpu_virt_reset_gpu(struct amdgpu_device *adev);
-int amdgpu_sriov_gpu_reset(struct amdgpu_device *adev, bool voluntary);
+int amdgpu_sriov_gpu_reset(struct amdgpu_device *adev, struct amdgpu_job *job);
int amdgpu_virt_alloc_mm_table(struct amdgpu_device *adev);
void amdgpu_virt_free_mm_table(struct amdgpu_device *adev);
......
......@@ -22,7 +22,7 @@
*/
#include <linux/firmware.h>
-#include "drmP.h"
+#include <drm/drmP.h>
#include "amdgpu.h"
#include "amdgpu_pm.h"
#include "amdgpu_ucode.h"
......