- 04 3月, 2016 15 次提交
-
-
由 Javier González 提交于
When an I/O finishes, full blocks are moved from the open to the closed list - a lock is taken to protect the list. This happens at the moment in the interrupt context, which is not correct. This patch moves this logic to the block workqueue instead, avoiding holding a spinlock without interrupt save in an interrupt context. Signed-off-by: NJavier González <javier@cnexlabs.com> Fixes: ff0e498b ("lightnvm: manage open and closed blocks sepa...") Signed-off-by: NMatias Bjørling <m@bjorling.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Konrad Rzeszutek Wilk 提交于
The processes names are truncated to 17, while we had the length of the process as name 20 - which meant that while we filled it out with various details - the last 3 characters (which had the queue number) never surfaced to the user-space. To simplify this and be able to fit the device name, domain id, and the queue number we remove the 'blkback' from the name. Prior to this patch the device name is "blkback.<domid>.<name>" for example: blkback.8.xvda, blkback.11.hda. With the multiqueue block backend we add "-%d" for the queue. But sadly this is already way past the limit so it gets stripped. Possible solution had been identified by Ian: http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg03516.html " If you are pressed for space then the "xvd" is probably a bit redundant in a string which starts blkbk. The guest may not even call the device xvdN (iirc BSD has another prefix) any how, so having blkback say so seems of limited use anyway. Since this seems to not include a partition number how does this work in the split partition scheme? (i.e. one where the guest is given xvda1 and xvda2 rather than xvda with a partition table) [It will be 'blkback.8.xvda1', and 'blkback.11.xvda2'] Perhaps something derived from one of the schemes in http://xenbits.xen.org/docs/unstable/misc/vbd-interface.txt might be a better fit? After a bit of discussion (see http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg01588.html) we settled on dropping the "blback" part. This will make it possible to have the <domid>.<name>-<queue>: [1.xvda-0] [1.xvda-1] And we enough space to make it go up to: [32100.xvdfg9-5] Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Reported-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Matias Bjørling 提交于
When the media manager runs in dual or quad plane mode, lightnvm abstracts away plane specific commands. This poses a problem for get bad block table, as it reports bad blocks per plane, making the table either two or four times bigger than expected. Fold the bad block list before returning. Signed-off-by: NMatias Bjørling <m@bjorling.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Alan 提交于
Instead of checking a constant 0 actually check the space available. Even better remember to allow for the header and also check the right amount of space is needed. Signed-off-by: NAlan Cox <alan@linux.intel.com> Signed-off-by: NMatias Bjørling <m@bjorling.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jan Beulich 提交于
There's no reason to defer this until the connect phase, and in fact there are frontend implementations expecting this to be available earlier. Move it into the probe function. Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Signed-off-by: NJan Beulich <jbeulich@suse.com> Cc: Bob Liu <bob.liu@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Jan Beulich 提交于
"max" is rather ambiguous and carries pretty little meaning, the more that there are also "max_queues" and "max_ring_page_order". Make this "max_indirect_segments" instead, and at once change the type from int to uint (to match the respective variable's type). Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Signed-off-by: NJan Beulich <jbeulich@suse.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
由 Asai Thambi SP 提交于
Fail all pending requests after surprise removal of a drive. Signed-off-by: NVignesh Gunasekaran <vgunasekaran@micron.com> Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Asai Thambi SP 提交于
Added timeout handler. Replaced blk_mq_end_request() with blk_mq_complete_request() to avoid double completion of a request. Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NRajesh Kumar Sambandam <rsambandam@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Asai Thambi SP 提交于
Allow device initialization to finish gracefully when it is in FTL rebuild failure state. Also, recover device out of this state after successfully secure erasing it. Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NVignesh Gunasekaran <vgunasekaran@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Asai Thambi SP 提交于
Flush inflight IOs using fsync_bdev() when the device is safely removed. Also, block further IOs in device open function. Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NRajesh Kumar Sambandam <rsambandam@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Asai Thambi SP 提交于
When FTL rebuild is in progress, alloc_disk() initializes the disk but device node will be created by add_disk() only after successful completion of FTL rebuild. So, skip deletion of device node in removal path when FTL rebuild is in progress. Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Asai Thambi SP 提交于
Prevent standby immediate command from being issued in remove, suspend and shutdown paths, while drive is in FTL rebuild process. Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NVignesh Gunasekaran <vgunasekaran@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Asai Thambi SP 提交于
Print exact time when an internal command is interrupted. Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NRajesh Kumar Sambandam <rsambandam@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Asai Thambi SP 提交于
Remove setting and clearing MTIP_PF_EH_ACTIVE_BIT flag in mtip_handle_tfe() as they are redundant. Also avoid waking up service thread from mtip_handle_tfe() because it is already woken up in case of taskfile error. Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NRajesh Kumar Sambandam <rsambandam@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Asai Thambi SP 提交于
Service thread does not detect the need for taskfile error hanlding. Fixed the flag condition to process taskfile error. Signed-off-by: NSelvan Mani <smani@micron.com> Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 03 3月, 2016 1 次提交
-
-
git://git.pengutronix.de/git/mpa/linux-nbd由 Jens Axboe 提交于
NBD for 4.6 Markus writes: This pull request contains 7 patches for 4.6. Patch 1 fixes some unnecessarily complicated code I introduced some versions ago for debugfs. Patch 2 removes the criticised signal usage within NBD to kill the NBD threads after a timeout. This code was used for the last years and is now replaced by simply killing the tcp connection. Patches 3-6 are some smaller cleanups. Patch 7 uevents for the userspace. This way udev/systemd can react on connected NBD devices.
-
- 01 3月, 2016 1 次提交
-
-
由 Ming Lin 提交于
For NVMe over Fabrics, the cntlid will be used by systemd/udev to create link to the device, for example, /dev/disk/by-path/<fabrics-info>-<cntlid>-<namespace> -> /dev/nvme0n1 Signed-off-by: NMing Lin <ming.l@ssi.samsung.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 29 2月, 2016 4 次提交
-
-
由 Christoph Hellwig 提交于
Both LighNVM and NVMe over Fabrics need to look at more than just the status and result field. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMatias Bj?rling <m@bjorling.me> Reviewed-by: NJay Freyensee <james.p.freyensee@intel.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
The only work left in the kthread is the periodic health check for each controller. There is no need to run this from process context or keep a thread context around for it, so replace it with a simpler timer. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
There is no reason to do unconditional polling of CQs per the NVMe spec. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Use a dedicated work item to submit async event requests instead of the global kthread. This simplifies the code and reduces the latencies to resubmit a request once an even notification happened. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 15 2月, 2016 1 次提交
-
-
由 Markus Pargmann 提交于
The userspace needs to know when nbd devices are ready for use. Currently no events are created for the userspace which doesn't work for systemd. See the discussion here: https://github.com/systemd/systemd/pull/358 This patch uses a central point to setup the nbd-internal sizes. A ioctl to set a size does not lead to a visible size change. The size of the block device will be kept at 0 until nbd is connected. As soon as it connects, the size will be changed to the real value and a uevent is created. When disconnecting, the blockdevice is set to 0 size and another uevent is generated. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de>
-
- 11 2月, 2016 4 次提交
-
-
由 Ming Lin 提交于
NVMe over Fabrics drivers are going to reuse the core, so splits nvme.ko into 2 modules: nvme-core.ko: the core part nvme.ko: the PCI driver Export symbols from nvme-core.ko. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMing Lin <ming.l@ssi.samsung.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lin 提交于
Split dev_list_lock into one in the core and one in the PCI driver. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NMing Lin <ming.l@ssi.samsung.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ming Lin 提交于
These variables are used by PCI driver and will also be used in the forthcoming NVMe over Fabrics drivers. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Signed-off-by: NMing Lin <ming.l@ssi.samsung.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Sagi Grimberg 提交于
We don't want to be able to unload the fabric driver when we have openened referenced to our namespaces. Thus, for each nvme_open we take a reference on the fabric driver and put it in nvme_release. This behavior is consistent with the scsi model. This resolves the panic when unloading a fabric module with mpath holders. Signed-off-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIan Bakshan <ianb@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NMing Lin <ming.l@ssi.samsung.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 10 2月, 2016 4 次提交
-
-
由 Sagi Grimberg 提交于
Having the ctrl name "nvmeX" seems much more friendly than the underlying device name. Also, with other nvme transports such as the soon to come nvme-loop we don't have an underlying device so it doesn't makes sense to make up one. In order to help matching an instance name to a pci function, we add a info print in nvme_probe. Signed-off-by: NSagi Grimberg <sagig@mellanox.com> Acked-by: NKeith Busch <keith.busch@intel.com> Manually fixed up the hunk in nvme_cancel_queue_ios(). Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Pass the right private data to device_create_with_groups from the beginning, and remove the superflous call to dev_set_drvdata. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJon Derrick <jonathan.derrick@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
This notifies blk-mq when the tag set contains a different number of queues prior to freeing unused ones that the request queue points to. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
The hardware's provided queue count may change at runtime with resource provisioning. This patch allows a block driver to alter the number of h/w queues available when its resource count changes. The main part is a new blk-mq API to request a new number of h/w queues for a given live tag set. The new API freezes all queues using that set, then adjusts the allocated count prior to remapping these to CPUs. The bulk of the rest just shifts where h/w contexts and all their artifacts are allocated and freed. The number of max h/w contexts is capped to the number of possible cpus since there is no use for more than that. As such, all pre-allocated memory for pointers need to account for the max possible rather than the initial number of queues. A side effect of this is that the blk-mq will proceed successfully as long as it can allocate at least one h/w context. Previously it would fail request queue initialization if less than the requested number was allocated. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NJon Derrick <jonathan.derrick@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 05 2月, 2016 9 次提交
-
-
由 Dan Streetman 提交于
Make the "Attempted send on closed socket" error messages generated in nbd_request_handler() ratelimited. When the nbd socket is shutdown, the nbd_request_handler() function emits an error message for every request remaining in its queue. If the queue is large, this will spam a large amount of messages to the log. There's no need for a separate error message for each request, so this patch ratelimits it. In the specific case this was found, the system was virtual and the error messages were logged to the serial port, which overwhelmed it. Fixes: 4d48a542 ("nbd: fix I/O hang on disconnected nbds") Signed-off-by: NDan Streetman <dan.streetman@canonical.com> Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de>
-
由 Markus Pargmann 提交于
nbd changes properties of the blockdevice depending on flags that were received. This patch moves this flag parsing into a separate function nbd_parse_flags(). Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de>
-
由 Markus Pargmann 提交于
Group all variables that are reset after a disconnect into reset functions. This patch adds two of these functions, nbd_reset() and nbd_bdev_reset(). Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de>
-
由 Markus Pargmann 提交于
It may be useful to know in the client that a connection timed out. The current code returns success for a timeout. This patch reports the error code -ETIMEDOUT for a timeout. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de>
-
由 Markus Pargmann 提交于
As discussed on the mailing list, the usage of signals for timeout handling has a lot of potential issues. The nbd driver used for some time signals for timeouts. These signals where able to get the threads out of the blocking socket operations. This patch removes all signal usage and uses a socket shutdown instead. The socket descriptor itself is cleared later when the whole nbd device is closed. The tasks_lock is removed as we do not depend on this anymore. Instead a new lock for the socket is introduced so we can safely work with the socket in the timeout handler outside of the two main threads. Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
由 Jan Kara 提交于
Currently we don't allow sync workload of one cgroup to preempt sync workload of any other cgroup. This is because we want to achieve service separation between cgroups. However in cases where cgroup preempting is ancestor of the current cgroup, there is no need of separation and idling introduces unnecessary overhead. This hurts for example the case when workload is isolated within a cgroup but journalling threads are in root cgroup. Simple way to demostrate the issue is using: dbench4 -c /usr/share/dbench4/client.txt -t 10 -D /mnt 1 on ext4 filesystem on plain SATA drive (mounted with barrier=0 to make difference more visible). When all processes are in the root cgroup, reported throughput is 153.132 MB/sec. When dbench process gets its own blkio cgroup, reported throughput drops to 26.1006 MB/sec. Fix the problem by making check in cfq_should_preempt() more benevolent and allow preemption by ancestor cgroup. This improves the throughput reported by dbench4 to 48.9106 MB/sec. Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJan Kara <jack@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jan Kara 提交于
The original idea with preemption of sync noidle queues (introduced in commit 718eee05 "cfq-iosched: fairness for sync no-idle queues") was that we service all sync noidle queues together, we don't idle on any of the queues individually and we idle only if there is no sync noidle queue to be served. This intention also matches the original test: if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD && new_cfqq->service_tree == cfqq->service_tree) return true; However since at that time cfqq->service_tree was not set for idling queues, this test was unreliable and was replaced in commit e4a22919 "cfq-iosched: fix no-idle preemption logic" by: if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD && cfqq_type(new_cfqq) == SYNC_NOIDLE_WORKLOAD && new_cfqq->service_tree->count == 1) return true; That was a reliable test but was actually doing something different - now we preempt sync noidle queue only if the new queue is the only one busy in the service tree. These days cfq queue is kept in service tree even if it is idling and thus the original check would be safe again. But since we actually check that cfq queues are in the same cgroup, of the same priority class and workload type (sync noidle), we know that new_cfqq is fine to preempt cfqq. So just remove the service tree check. Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJan Kara <jack@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jan Kara 提交于
Move check for preemption by rt class up. There is no functional change but it makes arguing about conditions simpler since we can be sure both cfq queues are from the same ioprio class. Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJan Kara <jack@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jan Kara 提交于
There is no point in idling on a cfq group if the only cfq queue that is there has too big thinktime. Signed-off-by: NJan Kara <jack@suse.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 03 2月, 2016 1 次提交
-
-
由 Markus Pargmann 提交于
Static checker complains about the implemented error handling. It is indeed wrong. We don't care about the return values of created debugfs files. We only have to check the return values of created dirs for NULL pointer. If we use a null pointer as parent directory for files, this may lead to debugfs files in wrong places. Signed-off-by: NMarkus Pargmann <mpa@pengutronix.de>
-