- 15 Jul, 2022 1 commit
-
-
Committed by Jinhao Fan
Implement the Doorbell Buffer Config command (Section 5.7 in NVMe Spec 1.3) and the Shadow Doorbell buffer and EventIdx buffer handling logic (Section 7.13 in NVMe Spec 1.3).

For queues created before the Doorbell Buffer Config command, the nvme_dbbuf_config function tries to associate each existing SQ and CQ with its Shadow Doorbell buffer and EventIdx buffer address. Queues created after the Doorbell Buffer Config command will have the doorbell buffers associated with them when they are initialized.

In nvme_process_sq and nvme_post_cqe, proactively check for Shadow Doorbell buffer changes instead of waiting for doorbell register changes. This reduces the number of MMIOs.

In nvme_process_db(), update the shadow doorbell buffer value with the doorbell register value if it is the admin queue. This is a hack since hosts like the Linux NVMe driver and SPDK do not use the shadow doorbell buffer for the admin queue. Copying the doorbell register value to the shadow doorbell buffer allows us to support these hosts as well as spec-compliant hosts that use the shadow doorbell buffer for the admin queue.

Signed-off-by: Jinhao Fan <fanjinhao21s@ict.ac.cn>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
[k.jensen: rebased]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
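The two doorbell paths described above can be sketched roughly as follows. This is a simplified illustration, not the QEMU code: `SketchSQ`, `sketch_process_db` and `sketch_update_sq_tail` are hypothetical names, the shadow doorbell is modelled as a plain pointer rather than guest memory accessed via DMA, and EventIdx handling is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified submission queue state (not QEMU's NvmeSQueue). */
typedef struct {
    uint32_t  tail;      /* last tail value the device has seen */
    bool      is_admin;  /* true for the admin queue */
    uint32_t *shadow_db; /* host-written shadow doorbell, NULL if not configured */
} SketchSQ;

/* Doorbell register write (the MMIO path). */
static void sketch_process_db(SketchSQ *sq, uint32_t new_tail)
{
    sq->tail = new_tail;
    /*
     * Some hosts ring only the doorbell register for the admin queue,
     * so mirror the register value into the shadow buffer to keep the
     * two views consistent.
     */
    if (sq->is_admin && sq->shadow_db != NULL) {
        *sq->shadow_db = new_tail;
    }
}

/* Called while processing the queue: pick up tail updates without MMIO. */
static void sketch_update_sq_tail(SketchSQ *sq)
{
    if (sq->shadow_db != NULL) {
        sq->tail = *sq->shadow_db;
    }
}
```

With this split, a host that only updates the shadow buffer is noticed the next time the queue is processed, and a host that only writes the admin doorbell register still ends up with a consistent shadow value.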
-
- 24 Jun, 2022 8 commits
-
-
Committed by Łukasz Gieryk
With the new command one can:
 - assign flexible resources (queues, interrupts) to primary and secondary controllers,
 - toggle the online/offline state of a given controller.

Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Łukasz Gieryk
With four new properties:
 - sriov_v{i,q}_flexible,
 - sriov_max_v{i,q}_per_vf,
one can configure the number of available flexible resources, as well as the limits. The primary and secondary controller capability structures are initialized accordingly.

Since the number of available queues (interrupts) now varies between VF/PF, BAR size calculation is also adjusted.

Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Łukasz Gieryk
The n->reg_size parameter unnecessarily splits the BAR0 size calculation into two phases; it is removed to simplify the code.

With all the calculations done in one place, it seems the pow2ceil, applied originally to reg_size, is unnecessary. The rounding should happen as the last step, when the BAR size includes the Nvme registers, queue registers, and MSIX-related space.

Finally, the size of the mmio memory region is extended to cover the first 4KiB padding (see the map below). Access to this range is handled as interaction with a non-existing queue and generates an error trace, so actually nothing changes, while the reg_size variable is no longer needed.

    --------------------
    |      BAR0        |
    --------------------
    [Nvme Registers    ]
    [Queues            ]
    [power-of-2 padding] - removed in this patch
    [4KiB padding (1)  ]
    [MSIX TABLE        ]
    [4KiB padding (2)  ]
    [MSIX PBA          ]
    [power-of-2 padding]

Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
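The single-pass layout in the map can be illustrated with a short calculation. This is a sketch under stated assumptions, not the code from the patch: the constants (a 4096-byte register block, 4-byte doorbells, 16-byte MSI-X table entries) and all `sketch_`/`SKETCH_` names are chosen here for illustration.

```c
#include <stdint.h>

#define SKETCH_REG_SIZE   4096u /* assumed size of the register block */
#define SKETCH_DB_SIZE    4u    /* assumed size of one doorbell register */
#define SKETCH_MSIX_ENTRY 16u   /* size of one MSI-X table entry in bytes */
#define SKETCH_4K         4096u

/* Round v up to the next multiple of a. */
static uint64_t sketch_align_up(uint64_t v, uint64_t a)
{
    return (v + a - 1) / a * a;
}

/* Round v up to the next power of two. */
static uint64_t sketch_pow2ceil(uint64_t v)
{
    uint64_t p = 1;
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* BAR0 size for a given number of queue pairs and interrupt vectors. */
static uint64_t sketch_bar0_size(unsigned queue_pairs, unsigned irqs)
{
    /* Registers, then one SQ and one CQ doorbell per queue pair. */
    uint64_t size = SKETCH_REG_SIZE + 2ull * queue_pairs * SKETCH_DB_SIZE;

    size = sketch_align_up(size, SKETCH_4K);    /* 4KiB padding (1) */
    size += (uint64_t)irqs * SKETCH_MSIX_ENTRY; /* MSIX table */
    size = sketch_align_up(size, SKETCH_4K);    /* 4KiB padding (2) */
    size += sketch_align_up(irqs, 64) / 8;      /* MSIX PBA, one bit per vector */

    return sketch_pow2ceil(size);               /* rounding as the last step */
}
```

Under these assumptions, 65 queue pairs and 65 vectors give 4616 bytes of registers and doorbells, padded to 8192; the table and PBA bring the total to 12304, which the final rounding turns into 16384.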
-
Committed by Łukasz Gieryk
The NVMe device defines two properties: max_ioqpairs, msix_qsize. Having them as constants is problematic for SR-IOV support.

SR-IOV introduces virtual resources (queues, interrupts) that can be assigned to the PF and its dependent VFs. Each device, following a reset, should work with the configured number of queues. A single constant is no longer sufficient to hold the whole state.

This patch tries to solve the problem by introducing additional variables in NvmeCtrl's state. The variables for, e.g., managing queues are therefore organized as:
 - n->params.max_ioqpairs: no changes, constant set by the user
 - n->(mutable_state): (not a part of this patch) user-configurable, specifies the number of queues available _after_ reset
 - n->conf_ioqpairs: (new) used in all the places instead of the 'old' n->params.max_ioqpairs; initialized in realize() and updated during reset() to reflect the user's changes to the mutable state

Since the number of available i/o queues and interrupts can change at runtime, buffers for sq/cqs and the MSIX-related structures are allocated big enough to handle the limits, to completely avoid the complicated reallocation. A helper function (nvme_update_msixcap_ts) updates the corresponding capability register, to signal configuration changes.

Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
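The relationship between the three values listed above might look like this in a stripped-down form. The struct and function names are hypothetical, and `next_ioqpairs` is only a stand-in for the mutable state that the commit says is not part of this patch.

```c
#include <stdint.h>

/* Hypothetical, reduced controller state. */
typedef struct {
    uint32_t max_ioqpairs;  /* constant, set by the user at device creation */
    uint32_t next_ioqpairs; /* stand-in for the mutable state (assumption) */
    uint32_t conf_ioqpairs; /* value the device actually operates with */
} SketchQueueConf;

static void sketch_realize(SketchQueueConf *n, uint32_t max_ioqpairs)
{
    n->max_ioqpairs = max_ioqpairs;
    n->next_ioqpairs = max_ioqpairs;
    n->conf_ioqpairs = max_ioqpairs;
}

/* Request a new queue count; it takes effect only after the next reset. */
static int sketch_configure(SketchQueueConf *n, uint32_t ioqpairs)
{
    if (ioqpairs > n->max_ioqpairs) {
        return -1; /* buffers are sized for max_ioqpairs, never more */
    }
    n->next_ioqpairs = ioqpairs;
    return 0;
}

static void sketch_reset(SketchQueueConf *n)
{
    n->conf_ioqpairs = n->next_ioqpairs;
}
```

Because everything is allocated for `max_ioqpairs` up front, a reset only has to copy a number; nothing needs to be reallocated.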
-
Committed by Łukasz Gieryk
This patch implements the Function Level Reset, a feature currently not implemented for the Nvme device, while listed as mandatory ("shall") in the 1.4 spec.

The implementation reuses FLR-related building blocks defined for the pci-bridge module, and follows the same logic:
 - FLR capability is advertised in the PCIE config,
 - custom pci_write_config callback detects a write to the trigger register and performs the PCI reset,
 - which, eventually, calls the custom dc->reset handler.

Depending on reset type, parts of the state should (or should not) be cleared. To distinguish the type of reset, an additional parameter is passed to the reset function.

This patch also enables advertisement of the Power Management PCI capability. The main reason behind it is to announce the no_soft_reset=1 bit, to signal SR-IOV support where each VF can be reset individually.

The implementation purposely ignores writes to the PMCS.PS register, as even such naïve behavior is enough to correctly handle the D3->D0 transition.

It is worth noting that the power state transition back to D3, with all the corresponding side effects, wasn't and still isn't handled properly.

Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Lukasz Maniak
Introduce handling for Secondary Controller List (Identify command with CNS value of 15h).

Secondary controller ids are unique in the subsystem, hence they are reserved by it upon initialization of the primary controller to the number of sriov_max_vfs.

ID reservation requires the addition of an intermediate controller slot state, so the reserved controller has the address 0xFFFF. A secondary controller is in the reserved state when it has no virtual function assigned, but its primary controller is realized. Secondary controller reservations are released to NULL when their primary controller is unregistered.

Signed-off-by: Lukasz Maniak <lukasz.maniak@linux.intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
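The three slot states (free, reserved, occupied) can be sketched with a sentinel pointer. Only the value 0xFFFF comes from the commit message; the array size and the `sketch_`/`SKETCH_` names are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_CTRLS 32
/* Sentinel address marking a slot as reserved but not yet occupied. */
#define SKETCH_SLOT_RSVD ((void *)0xFFFF)

/* Hypothetical subsystem: NULL = free, RSVD = reserved, else a controller. */
typedef struct {
    void *ctrls[SKETCH_MAX_CTRLS];
} SketchSubsys;

/* Reserve up to num_vfs free slots; returns how many were reserved. */
static int sketch_reserve(SketchSubsys *s, int num_vfs)
{
    int reserved = 0;
    for (int i = 0; i < SKETCH_MAX_CTRLS && reserved < num_vfs; i++) {
        if (s->ctrls[i] == NULL) {
            s->ctrls[i] = SKETCH_SLOT_RSVD;
            reserved++;
        }
    }
    return reserved;
}

static bool sketch_is_reserved(const SketchSubsys *s, int cntlid)
{
    return s->ctrls[cntlid] == SKETCH_SLOT_RSVD;
}

/* Release all reservations back to NULL (primary controller unregistered). */
static void sketch_unreserve(SketchSubsys *s)
{
    for (int i = 0; i < SKETCH_MAX_CTRLS; i++) {
        if (s->ctrls[i] == SKETCH_SLOT_RSVD) {
            s->ctrls[i] = NULL;
        }
    }
}
```

Using a sentinel keeps the slot array a plain array of pointers while still letting a lookup tell a reserved id apart from a free one.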
-
Committed by Lukasz Maniak
Implementation of Primary Controller Capabilities data structure (Identify command with CNS value of 14h).

Currently, the command returns only the ID of a primary controller. Handling of the remaining fields is added in subsequent patches implementing virtualization enhancements.

Signed-off-by: Lukasz Maniak <lukasz.maniak@linux.intel.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Lukasz Maniak
This patch implements initial support for Single Root I/O Virtualization on an NVMe device.

Essentially, it allows defining the maximum number of virtual functions supported by the NVMe controller via the sriov_max_vfs parameter.

Passing a non-zero value to sriov_max_vfs triggers reporting of the SR-IOV capability by a physical controller and the ARI capability by both the physical and virtual function devices.

NVMe controllers created via virtual functions functionally mirror the physical controller, which may not entirely be the case, thus consideration would be needed on the way to limit the capabilities of the VF.

An NVMe subsystem is required for the use of SR-IOV.

Signed-off-by: Lukasz Maniak <lukasz.maniak@linux.intel.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
- 04 Jun, 2022 1 commit
-
-
Committed by Klaus Jensen
The Identify Controller Serial Number (SN) is the serial number for the NVM subsystem and must be the same across all controllers in the NVM subsystem. Enforce this.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
- 11 May, 2022 1 commit
-
-
Committed by Markus Armbruster
Header guard symbols should match their file name to make guard collisions less likely.

Cleaned up with scripts/clean-header-guards.pl, followed by some renaming of new guard symbols picked by the script to better ones.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20220506134911.2856099-2-armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[Change to generated file ebpf/rss.bpf.skeleton.h backed out]
-
- 03 Mar, 2022 4 commits
-
-
Committed by Naveen Nagar
This adds support for one possible new protection information format introduced in TP4068 (and integrated in NVMe 2.0): the 64-bit CRC guard and 48-bit reference tag. This version does not support storage tags.

Like the CRC16 support already present, this uses a software implementation of CRC64 (so it is naturally pretty slow). But it's good enough for verification purposes.

This may go nicely hand-in-hand with the support that Keith submitted for the Linux kernel[1].

[1]: https://lore.kernel.org/linux-nvme/20220126165214.GA1782352@dhcp-10-100-145-180.wdc.com/T/

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
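To give a sense of why a pure software CRC64 is slow, here is a minimal bit-at-a-time sketch. It is not the implementation from the patch. The parameters are an assumption: the reflected polynomial 0x9A6C9329AC4BC9B5 with an all-ones initial value and final XOR, which is how the NVMe 64-bit CRC is commonly described; check them against the specification before relying on them.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed reflected form of the 64-bit guard polynomial (see lead-in). */
#define SKETCH_CRC64_POLY 0x9A6C9329AC4BC9B5ULL

/* Bitwise, reflected CRC64: eight shift/XOR steps per input byte. */
static uint64_t sketch_crc64(const uint8_t *buf, size_t len)
{
    uint64_t crc = ~0ULL; /* initial value: all ones */

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1) {
                crc = (crc >> 1) ^ SKETCH_CRC64_POLY;
            } else {
                crc >>= 1;
            }
        }
    }

    return ~crc; /* final XOR: all ones */
}
```

A table-driven variant would process a byte per lookup instead of a bit per iteration, but either way the work is done on the CPU for every data block, which is acceptable for verification and little else.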
-
Committed by Naveen Nagar
Add support for up to 64 LBA formats through the LBAFEE field of the Host Behavior Support feature.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
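Going from 16 to 64 formats means the format index needs six bits instead of four. The sketch below assumes the layout in which the Formatted LBA Size (FLBAS) byte carries the low four index bits in bits 3:0 and the two extra bits in bits 6:5; the function name and the `lbafee` flag are illustrative, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode the LBA format index from an FLBAS byte.
 * Assumed layout: bits 3:0 hold index bits 3:0, bits 6:5 hold index
 * bits 5:4. The upper bits are only honoured when the host has enabled
 * the extended formats (lbafee).
 */
static unsigned sketch_lbaf_index(uint8_t flbas, bool lbafee)
{
    unsigned index = flbas & 0x0f;

    if (lbafee) {
        index |= (flbas & 0x60) >> 1;
    }

    return index;
}
```

For example, an FLBAS value of 0x6f decodes to index 63 with the extension enabled and to index 15 without it.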
-
Committed by Naveen Nagar
Add support for getting and setting the Host Behavior Support feature.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Naveen Nagar <naveen.n1@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Klaus Jensen
Move dif/pi data structures and inlines to dif.h.

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
- 14 Feb, 2022 2 commits
-
-
Committed by Klaus Jensen
Add support for TP 4076 ("Zoned Random Write Area"), v2021.08.23 ("Ratified").

This adds three new namespace parameters: "zoned.numzrwa" (number of zrwa resources, i.e. number of zones that can have a zrwa), "zoned.zrwas" (zrwa size in LBAs), "zoned.zrwafg" (granularity in LBAs for flushes).

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Philippe Mathieu-Daudé
These buffers can be anything, not an array of chars, so use the 'void *' type for them.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
- 27 Jul, 2021 3 commits
-
-
Committed by Klaus Jensen
Prior to this patch the nvme-ns devices are always children of the NvmeBus owned by the NvmeCtrl. This causes the namespaces to be unrealized when the parent device is removed. However, when subsystems are involved, this is not what we want since the namespaces may be attached to other controllers as well.

This patch adds an additional NvmeBus on the subsystem device. When nvme-ns devices are realized, if the parent controller device is linked to a subsystem, the parent bus is set to the subsystem one instead. This makes sure that namespaces are kept alive and not unrealized.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Klaus Jensen
Make sure the controller is unregistered from the subsystem when the device is removed.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Klaus Jensen
The nvme_ns_setup and nvme_ns_check_constraints functions should not depend on the controller state. Refactor them to remove that dependency.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
- 29 Jun, 2021 7 commits
-
-
Committed by Klaus Jensen
Jakub noticed[1] that, when using pin-based interrupts, the device will unconditionally deassert when any CQEs are acknowledged. However, the pin should not be deasserted if other completion queues still hold unacknowledged CQEs.

The bug is an artifact of commit ca247d35 ("hw/block/nvme: fix pin-based interrupt behavior") which fixed one bug but introduced another. This is the third time someone tries to fix pin-based interrupts (see commit 5e9aa92e ("hw/block: Fix pin-based interrupt behaviour of NVMe"))...

Third time's the charm, so fix it, again, by keeping track of how many CQs have unacknowledged CQEs and only deassert when all are cleared.

[1]: <20210610114624.304681-1-jakub.jermar@kernkonzept.com>

Cc: qemu-stable@nongnu.org
Fixes: ca247d35 ("hw/block/nvme: fix pin-based interrupt behavior")
Reported-by: Jakub Jermář <jakub.jermar@kernkonzept.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
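The counting approach described in the last paragraph can be sketched as below. The types and function names are hypothetical and the model is deliberately reduced: each completion queue only remembers whether it has unacknowledged entries, and the shared pin follows a counter.

```c
#include <stdbool.h>

/* Hypothetical shared interrupt pin state. */
typedef struct {
    int  pending_cqs;  /* number of CQs holding unacknowledged CQEs */
    bool pin_asserted;
} SketchIrq;

/* Hypothetical completion queue, reduced to a single flag. */
typedef struct {
    bool has_unacked;
} SketchCQ;

/* A CQE was posted to cq: count the queue once and assert the pin. */
static void sketch_cq_post(SketchIrq *irq, SketchCQ *cq)
{
    if (!cq->has_unacked) {
        cq->has_unacked = true;
        irq->pending_cqs++;
    }
    irq->pin_asserted = true;
}

/* The host acknowledged every CQE in cq. */
static void sketch_cq_ack_all(SketchIrq *irq, SketchCQ *cq)
{
    if (cq->has_unacked) {
        cq->has_unacked = false;
        irq->pending_cqs--;
    }
    /* Deassert only when no queue has anything left to acknowledge. */
    if (irq->pending_cqs == 0) {
        irq->pin_asserted = false;
    }
}
```

With two queues pending, acknowledging the first leaves the pin asserted; only the second acknowledgement brings the counter to zero and releases it.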
-
Committed by Klaus Jensen
The nvme_check_prinfo() and nvme_dif_check() functions operate on the 16 bit "control" member of the NvmeCmd. These functions do not otherwise operate on an NvmeCmd or an NvmeRequest, so change them to expect the actual 4 bit PRINFO field and add constants that work on this field as well.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
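As an illustration of working on the 4 bit field rather than the 16 bit control word, the sketch below extracts PRINFO and tests individual bits. The bit positions are an assumption based on the usual command layout (PRINFO in bits 13:10 of the control word, PRACT as its top bit followed by the guard, application tag and reference tag checks); the macro and enum names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of PRINFO inside the 16 bit control word. */
#define SKETCH_PRINFO(control) (((control) >> 10) & 0xf)

/* Constants that apply to the extracted 4 bit field (assumed layout). */
enum {
    SKETCH_PRINFO_PRCHK_REF   = 1 << 0,
    SKETCH_PRINFO_PRCHK_APP   = 1 << 1,
    SKETCH_PRINFO_PRCHK_GUARD = 1 << 2,
    SKETCH_PRINFO_PRACT       = 1 << 3,
};

/* A helper that needs only the 4 bit field, not the whole command. */
static bool sketch_checks_guard(uint8_t prinfo)
{
    return (prinfo & SKETCH_PRINFO_PRCHK_GUARD) != 0;
}
```

A caller would extract the field once, for example `uint8_t prinfo = SKETCH_PRINFO(control);`, and pass only that value on, so the helpers no longer need to know where the field lives in the command.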
-
Committed by Klaus Jensen
Prepare nvme_dif_pract_generate_dif() and nvme_dif_check() to be callable in smaller increments by making the reftag a pointer parameter updated by the function.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Committed by Klaus Jensen
Prior to this patch, a broadcast flush would result in submitting multiple "fire and forget" aios (no reference saved to the aiocbs returned from the blk_aio_flush calls).

Fix this by issuing the flushes one after another.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Committed by Heinrich Schuchardt
On machines with version > 6.0, replace a missing EUI-64 with a generated value.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Heinrich Schuchardt
The EUI-64 field is the only identifier for NVMe namespaces in UEFI device paths. Add a new namespace property "eui64" that gives the user the option to specify the EUI-64.

Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Acked-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
-
Committed by Niklas Cassel
The Zoned Namespace Command Set Specification, chapter 2.5.1 ("Managing resources"), states: "The controller may transition zones in the ZSIO:Implicitly Opened state to the ZSC:Closed state for resource management purposes."

The word "may" in this sentence means that automatically transitioning an implicitly opened zone to closed is completely optional.

Add a new parameter so that the user can control whether this automatic transitioning should be performed or not.

Being able to control this can help with verifying that e.g. a user-space program behaves properly even without this optional ZNS feature.

The default value is set to true, in order to not change the existing behavior.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
[k.jensen: moved parameter to controller]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
-
- 17 May, 2021 8 commits
-
-
Committed by Klaus Jensen
With the introduction of the nvme-subsystem device we are really cluttering up the hw/block directory.

As suggested by Philippe previously, move the nvme emulation to hw/nvme.

Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Committed by Klaus Jensen
The NvmeCtrl num_namespaces member is just an indirection for the NVME_MAX_NAMESPACES constant.

Remove the indirection.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Committed by Klaus Jensen
Streamline namespace array indexing such that both the subsystem and controller namespaces arrays are 1-indexed.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
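Since namespace identifiers start at 1, a 1-indexed array lets the NSID be used directly as the index. A minimal sketch of the idea follows; the limit of 256 and every `Sketch`-prefixed name are assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_NAMESPACES 256

typedef struct {
    uint32_t nsid;
} SketchNs;

/* One extra slot so that valid NSIDs 1..MAX index the array directly. */
typedef struct {
    SketchNs *namespaces[SKETCH_MAX_NAMESPACES + 1];
} SketchNsTable;

/* Look up a namespace by NSID; slot 0 is never used. */
static SketchNs *sketch_ns(SketchNsTable *t, uint32_t nsid)
{
    if (nsid == 0 || nsid > SKETCH_MAX_NAMESPACES) {
        return NULL;
    }
    return t->namespaces[nsid];
}
```

The cost is one unused pointer; the benefit is that no call site has to remember an `nsid - 1` adjustment, and that the same convention holds for both arrays.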
-
Committed by Klaus Jensen
Add an nvme_moff() helper.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Committed by Klaus Jensen
There is no need to look up the lba size and metadata size in the LBA Format structure every time we want to use it. And we use it a lot.

Cache the values in the NvmeNamespace and update them if the namespace is formatted.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
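The caching idea can be sketched as follows. An LBA format stores the data size as a power-of-two exponent and the metadata size in bytes; the struct layout, field names and functions here are simplified assumptions rather than the actual NvmeNamespace definition.

```c
#include <stdint.h>

/* Simplified LBA format descriptor. */
typedef struct {
    uint16_t ms; /* metadata size in bytes */
    uint8_t  ds; /* LBA data size as a power of two (9 means 512 bytes) */
} SketchLbaf;

/* Hypothetical namespace with cached values next to the format table. */
typedef struct {
    SketchLbaf lbaf[16];
    uint8_t    flbas; /* index of the format currently in use */
    uint32_t   lbasz; /* cached: LBA data size in bytes */
    uint16_t   lbams; /* cached: metadata size in bytes */
} SketchNamespace;

/* Refresh the cache; call at setup and again whenever the format changes. */
static void sketch_ns_cache_format(SketchNamespace *ns)
{
    const SketchLbaf *f = &ns->lbaf[ns->flbas & 0xf];

    ns->lbasz = 1u << f->ds;
    ns->lbams = f->ms;
}

/* Hot-path helper: convert an LBA to a byte offset using the cache. */
static uint64_t sketch_l2b(const SketchNamespace *ns, uint64_t lba)
{
    return lba * ns->lbasz;
}
```

The lookup and shift happen once per format change instead of once per I/O; the only obligation is to refresh the cache when the namespace is formatted.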
-
Committed by Klaus Jensen
The inline nvme_ns_status() helper only has a single call site. Remove it from the header file and inline it for real.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Committed by Klaus Jensen
Remove non-shared defines from the shared header.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Committed by Klaus Jensen
In preparation for moving the nvme device into its own subtree, merge the header files into one.

Also add the missing copyright notice and a list of authors with substantial contributions.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
- 07 Apr, 2021 3 commits
-
-
Committed by Klaus Jensen
Prior to this patch, if a private nvme-ns device (that is, a namespace that is not linked to a subsystem) is wired up to an nvme-subsys linked nvme controller device, the device fails to verify that the namespace id is unique within the subsystem. NVM Express v1.4b, Section 6.1.6 ("NSID and Namespace Usage") states that because the device supports Namespace Management, "NSIDs *shall* be unique within the NVM subsystem".

Additionally, prior to this patch, private namespaces are not known to the subsystem and the namespace is considered exclusive to the controller to which it is initially wired up. However, this is not the definition of a private namespace; per Section 1.6.33 ("private namespace"), a private namespace is just a namespace that does not support multipath I/O or namespace sharing, which means "that it is only able to be attached to one controller at a time".

Fix this by always allocating namespaces in the subsystem (if one is linked to the controller), regardless of the shared/private status of the namespace. Whether or not the namespace is shareable is controlled by a new `shared` nvme-ns parameter.

Finally, this fix allows the nvme-ns `subsys` parameter to be removed, since the `shared` parameter now serves the purpose of attaching the namespace to all controllers in the subsystem upon device realization. It is invalid to have an nvme-ns namespace device with a linked subsystem without the parent nvme controller device also being linked to one and since the nvme-ns devices will unconditionally be "attached" (in QEMU terms that is) to an nvme controller device through an NvmeBus, the nvme-ns namespace device can always get a reference to the subsystem of the controller it is explicitly (using 'bus=' parameter) or implicitly attaching to.

Fixes: e5707685 ("hw/block/nvme: support for shared namespace in subsystem")
Cc: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
-
Committed by Klaus Jensen
Remove the unused BlockConf from the controller structure and remove the noop constraint checking.

The device works just fine with both the legacy drive parameter namespace and nvme-ns namespace definitions.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
-
Committed by Klaus Jensen
Add the missing nvme_adm_opc_str entry for the Namespace Attachment command.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
- 18 Mar, 2021 2 commits
-
-
Committed by Minwoo Im
The Format NVM admin command can reformat one or more namespaces with a different LBA size and metadata size, and with a different protection information type.

This patch introduces the Format NVM command with LBA format, Metadata, and Protection Information for the device. The secure erase operation and support for formatting zoned namespaces are yet to be added.

The parameter checks in this patch are based on Keith's old branch.

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
[anaidu.gollu: rebased on e2e]
Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com>
[k.jensen: rebased for reworked aio tracking]
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
Committed by Klaus Jensen
Verify is not subject to MDTS, so a single Verify command may result in excessive amounts of allocated memory. Impose a limit on the data size by adding support for TP 4040 ("Non-MDTS Command Size Limits").

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
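A size limit of this kind could be checked along the following lines. The sketch assumes the limit is expressed the same way as MDTS, as a power-of-two multiple of the minimum memory page size, and it does not model any special meaning of particular values; `sketch_verify_len_ok` and its parameters are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a Verify of len_bytes fits within the size limit.
 * Assumption: the limit is page_size * 2^vsl bytes.
 */
static bool sketch_verify_len_ok(uint64_t len_bytes, uint32_t page_size,
                                 uint8_t vsl)
{
    uint64_t limit = (uint64_t)page_size << vsl;

    return len_bytes <= limit;
}
```

With a 4096-byte page and an exponent of 7, the limit works out to 512 KiB, so a command covering 524288 bytes passes and one covering a single byte more is rejected.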
-