- 19 6月, 2017 10 次提交
-
-
由 Ming Lei 提交于
It is required that no dispatch can happen any more once blk_mq_quiesce_queue() returns, and we don't have such requirement on APIs of stopping queue. But blk_mq_quiesce_queue() still may not block/drain dispatch in the the case of BLK_MQ_S_START_ON_RUN, so use the new introduced flag of QUEUE_FLAG_QUIESCED and evaluate it inside RCU read-side critical sections for fixing this issue. Also blk_mq_quiesce_queue() is implemented via stopping queue, which limits its uses, and easy to cause race, because any queue restart in other paths may break blk_mq_quiesce_queue(). With the introduced flag of QUEUE_FLAG_QUIESCED, we don't need to depend on stopping queue for quiescing any more. Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
blk_mq_start_stopped_hw_queues() is used implictly as counterpart of blk_mq_quiesce_queue() for unquiescing queue, so we introduce blk_mq_unquiesce_queue() and make it as counterpart of blk_mq_quiesce_queue() explicitly. This function is for improving the current quiescing mechanism in the following patches. Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
This patch introduces blk_mq_quiesce_queue_nowait() so that we can workaround mpt3sas for quiescing its queue. Once mpt3sas is fixed, we can remove this helper. Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
We usually put blk_mq_*() into include/linux/blk-mq.h, so move this API into there. Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 NeilBrown 提交于
bio_clone() is no longer used. Only bio_clone_bioset() or bio_clone_fast(). This is for the best, as bio_clone() used fs_bio_set, and filesystems are unlikely to want to use bio_clone(). So remove bio_clone() and all references. This includes a fix to some incorrect documentation. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 NeilBrown 提交于
This patch converts bioset_create() to not create a workqueue by default, so alloctions will never trigger punt_bios_to_rescuer(). It also introduces a new flag BIOSET_NEED_RESCUER which tells bioset_create() to preserve the old behavior. All callers of bioset_create() that are inside block device drivers, are given the BIOSET_NEED_RESCUER flag. biosets used by filesystems or other top-level users do not need rescuing as the bio can never be queued behind other bios. This includes fs_bio_set, blkdev_dio_pool, btrfs_bioset, xfs_ioend_bioset, and one allocated by target_core_iblock.c. biosets used by md/raid do not need rescuing as their usage was recently audited and revised to never risk deadlock. It is hoped that most, if not all, of the remaining biosets can end up being the non-rescued version. Reviewed-by: NChristoph Hellwig <hch@lst.de> Credit-to: Ming Lei <ming.lei@redhat.com> (minor fixes) Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 NeilBrown 提交于
"flags" arguments are often seen as good API design as they allow easy extensibility. bioset_create_nobvec() is implemented internally as a variation in flags passed to __bioset_create(). To support future extension, make the internal structure part of the API. i.e. add a 'flags' argument to bioset_create() and discard bioset_create_nobvec(). Note that the bio_split allocations in drivers/md/raid* do not need the bvec mempool - they should have used bioset_create_nobvec(). Suggested-by: NChristoph Hellwig <hch@infradead.org> Reviewed-by: NChristoph Hellwig <hch@infradead.org> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 NeilBrown 提交于
blk_queue_split() is always called with the last arg being q->bio_split, where 'q' is the first arg. Also blk_queue_split() sometimes uses the passed-in 'bs' and sometimes uses q->bio_split. This is inconsistent and unnecessary. Remove the last arg and always use q->bio_split inside blk_queue_split() Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Credit-to: Javier González <jg@lightnvm.io> (Noticed that lightnvm was missed) Reviewed-by: NJavier González <javier@cnexlabs.com> Tested-by: NJavier González <javier@cnexlabs.com> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
This patch makes sure we always allocate requests in the core blk-mq code and use a common prepare_request method to initialize them for both mq I/O schedulers. For Kyber and additional limit_depth method is added that is called before allocating the request. Also because none of the intializations can really fail the new method does not return an error - instead the bfq finish method is hardened to deal with the no-IOC case. Last but not least this removes the abuse of RQF_QUEUE by the blk-mq scheduling code as RQF_ELFPRIV is all that is needed now. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
No need to have two different callouts of bfq vs kyber. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 16 6月, 2017 1 次提交
-
-
由 Scott Bauer 提交于
The NVMe 1.3 spec introduces Namespace Optimal IO Boundaries (NOIOB), which standardizes the stripe mechanism we currently have quirks for. This patch implements the necessary logic to handle this new feature. Signed-off-by: NScott Bauer <scott.bauer@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 15 6月, 2017 6 次提交
-
-
由 Guan Junxiong 提交于
Add the new to NVMe 1.3 fields EDSTT, DSTO, FWUG, HCTMA, MNTMT, MXTMT, and SANICAP into the idenfity controller data structure. Signed-off-by: NGuan Junxiong <guanjunxiong@huawei.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Johannes Thumshirn 提交于
Allow overriding the announced NVMe Version of a via configfs. This is particularly helpful when debugging new features for the host or target side without bumping the hard coded version (as the target might not be fully compliant to the announced version yet). Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NGuan Junxiong <guanjunxiong@huawei.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Johannes Thumshirn 提交于
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Johannes Thumshirn 提交于
Use NVME_IDENTIFY_DATA_SIZE define instead of hard coding the magic 4096 value. Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NHannes Reinecke <hare@suse.com> [hch: converted three more users] Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Johannes Thumshirn 提交于
The sg_zero_buffer() helper is used to zero fill an area in a SG list. Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> [hch: renamed to sg_zero_buffer] Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Xu Yu 提交于
The existing driver initially maps 8192 bytes of BAR0 which is intended to cover doorbells of admin SQ and CQ. However, if a large stride, e.g. 10, is used, the doorbell of admin CQ will be out of 8192 bytes. Consequently, a page fault will be raised when the admin CQ doorbell is accessed in nvme_configure_admin_queue(). This patch fixes this issue by remapping BAR0 before accessing admin CQ doorbell if the initial mapping is not enough. Signed-off-by: NXu Yu <yu.a.xu@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 13 6月, 2017 2 次提交
-
-
由 Arnav Dawn 提交于
Signed-off-by: NArnav Dawn <a.dawn@samsung.com> [hch: split from a larger patch, new changelog] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
-
- 12 6月, 2017 1 次提交
-
-
由 Linus Torvalds 提交于
Commit abb2ea7d ("compiler, clang: suppress warning for unused static inline functions") just caused more warnings due to re-defining the 'inline' macro. So undef it before re-defining it, and also add the 'notrace' attribute like the gcc version that this is overriding does. Maybe this makes clang happier. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 6月, 2017 8 次提交
-
-
由 Mike Snitzer 提交于
Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep arround at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Use the same values for use for request completion errors as the return value from ->queue_rq. BLK_STS_RESOURCE is special cased to cause a requeue, and all the others are completed as-is. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Turn the error paramter into a pointer so that target drivers can change the value, and make sure only DM_ENDIO_* values are returned from the methods. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Eric Biggers 提交于
Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NStephan Mueller <smueller@chronox.de> Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
由 Eric Biggers 提交于
While a 'struct key' itself normally does not contain sensitive information, Documentation/security/keys.txt actually encourages this: "Having a payload is not required; and the payload can, in fact, just be a value stored in the struct key itself." In case someone has taken this advice, or will take this advice in the future, zero the key structure before freeing it. We might as well, and as a bonus this could make it a bit more difficult for an adversary to determine which keys have recently been in use. This is safe because the key_jar cache does not use a constructor. Signed-off-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NJames Morris <james.l.morris@oracle.com>
-
- 08 6月, 2017 3 次提交
-
-
由 Paolo Bonzini 提交于
Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting down a guest running iperf on a VFIO assigned device. This happens because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt context, while a worker thread does the same inside kvm_set_irq(). If the interrupt happens while the worker thread is executing __srcu_read_lock(), updates to the Classic SRCU ->lock_count[] field or the Tree SRCU ->srcu_lock_count[] field can be lost. The docs say you are not supposed to call srcu_read_lock() and srcu_read_unlock() from irq context, but KVM interrupt injection happens from (host) interrupt context and it would be nice if SRCU supported the use case. KVM is using SRCU here not really for the "sleepable" part, but rather due to its IPI-free fast detection of grace periods. It is therefore not desirable to switch back to RCU, which would effectively revert commit 719d93cd ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16). However, the docs are overly conservative. You can have an SRCU instance only has users in irq context, and you can mix process and irq context as long as process context users disable interrupts. In addition, __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and Classic SRCU. For those two implementations, only srcu_read_lock() is unsafe. When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(), in commit 5a41344a ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments. Therefore it kept __this_cpu_inc(), with preempt_disable/enable in the caller. Tree SRCU however only does one increment, so on most architectures it is more efficient for __srcu_read_lock() to use this_cpu_inc(), and any performance differences appear to be down in the noise. Cc: stable@vger.kernel.org Fixes: 719d93cd ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING") Reported-by: NLinu Cherian <linuc.decode@gmail.com> Suggested-by: NLinu Cherian <linuc.decode@gmail.com> Cc: kvm@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
由 Hannes Reinecke 提交于
When generating bootable VM images certain systems (most notably s390x) require devices with 4k blocksize. This patch implements a new flag 'LO_FLAGS_BLOCKSIZE' which will set the physical blocksize to that of the underlying device, and allow to change the logical blocksize for up to the physical blocksize. Signed-off-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Without this the build will fail for !CONFIG_ACPI builds on x86. Fixes: 94116f81 ("ACPI: Switch to use generic guid_t in acpi_evaluate_dsm()") Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 07 6月, 2017 4 次提交
-
-
由 Andy Shevchenko 提交于
acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16 bytes. Instead we convert them to use guid_t type. At the same time we convert current users. acpi_str_to_uuid() becomes useless after the conversion and it's safe to get rid of it. Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Borislav Petkov <bp@suse.de> Acked-by: NDan Williams <dan.j.williams@intel.com> Cc: Amir Goldstein <amir73il@gmail.com> Reviewed-by: NJarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Acked-by: NJani Nikula <jani.nikula@intel.com> Cc: Ben Skeggs <bskeggs@redhat.com> Acked-by: NBenjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: NJoerg Roedel <jroedel@suse.de> Acked-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Acked-by: NBjorn Helgaas <bhelgaas@google.com> Acked-by: NFelipe Balbi <felipe.balbi@linux.intel.com> Acked-by: NMathias Nyman <mathias.nyman@linux.intel.com> Reviewed-by: NHeikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: NMark Brown <broonie@kernel.org> Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Rafael J. Wysocki 提交于
Revert commit eed4d47e (ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle) as it turned out to be premature and triggered a number of different issues on various systems. That includes, but is not limited to, premature suspend-to-RAM aborts on Dell XPS 13 (9343) reported by Dominik. The issue the commit in question attempted to address is real and will need to be taken care of going forward, but evidently more work is needed for this purpose. Reported-by: NDominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 David Rientjes 提交于
GCC explicitly does not warn for unused static inline functions for -Wunused-function. The manual states: Warn whenever a static function is declared but not defined or a non-inline static function is unused. Clang does warn for static inline functions that are unused. It turns out that suppressing the warnings avoids potentially complex #ifdef directives, which also reduces LOC. Suppress the warning for clang. Signed-off-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Biggers 提交于
gcc 7.1 reports the following warning: block/elevator.c: In function ‘elv_register’: block/elevator.c:898:5: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=] "%s_io_cq", e->elevator_name); ^~~~~~~~~~ block/elevator.c:897:3: note: ‘snprintf’ output between 7 and 22 bytes into a destination of size 21 snprintf(e->icq_cache_name, sizeof(e->icq_cache_name), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "%s_io_cq", e->elevator_name); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The bug is that the name of the icq_cache is 6 characters longer than the elevator name, but only ELV_NAME_MAX + 5 characters were reserved for it --- so in the case of a maximum-length elevator name, the 'q' character in "_io_cq" would be truncated by snprintf(). Fix it by reserving ELV_NAME_MAX + 6 characters instead. Signed-off-by: NEric Biggers <ebiggers@google.com> Reviewed-by: NBart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 06 6月, 2017 1 次提交
-
-
由 Andy Shevchenko 提交于
There are new types and helpers that are supposed to be used in new code. As a preparation to get rid of legacy types and API functions do the conversion here. Reviewed-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 05 6月, 2017 4 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAmir Goldstein <amir73il@gmail.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
-
由 Christoph Hellwig 提交于
For some file systems we still memcpy into it, but in various places this already allows us to use the proper uuid helpers. More to come.. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAmir Goldstein <amir73il@gmail.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> (Changes to IMA/EVM) Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
-
由 Christoph Hellwig 提交于
This helper was only used by IMA of all things, which would get spurious errors if CONFIG_BLOCK is disabled. Just opencode the call there. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NAmir Goldstein <amir73il@gmail.com> Acked-by: NMimi Zohar <zohar@linux.vnet.ibm.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
-
由 Christoph Hellwig 提交于
Hoist the libnvdimm helper as an inline helper to linux/uuid.h using an auxiliary const variable uuid_null in lib/uuid.c. [hch: also add the guid variant. Both do the same but I'd like to keep casts to a minimum] The common helper uses the new abstract type uuid_t * instead of u8 *. Suggested-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAmir Goldstein <amir73il@gmail.com> [hch: added guid_is_null] Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NDan Williams <dan.j.williams@intel.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
-