- 06 3月, 2022 2 次提交
-
-
由 Mingzhe Zou 提交于
When multiple threads to check btree nodes in parallel, the main thread wait for all threads to stop or CACHE_SET_IO_DISABLE flag: wait_event_interruptible(check_state->wait, atomic_read(&check_state->started) == 0 || test_bit(CACHE_SET_IO_DISABLE, &c->flags)); However, the bch_btree_node_read and bch_btree_node_read_done maybe call bch_cache_set_error, then the CACHE_SET_IO_DISABLE will be set. If the flag already set, the main thread return error. At the same time, maybe some threads still running and read NULL pointer, the kernel will crash. This patch change the event wait condition, the main thread must wait for all threads to stop. Fixes: 8e710227 ("bcache: make bch_btree_check() to be multithreaded") Signed-off-by: NMingzhe Zou <mingzhe.zou@easystack.cn> Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: NColy Li <colyli@suse.de>
-
由 Mingzhe Zou 提交于
When attaching a cached device (a.k.a backing device) to a cache device, bch_sectors_dirty_init() is called to count dirty sectors and stripes (see what bcache_dev_sectors_dirty_add() does) on the cache device. When bcache_dev_sectors_dirty_add() is called, set_bit(stripe, d->full_dirty_stripes) or clear_bit(stripe, d->full_dirty_stripes) operation will always be performed. In full_dirty_stripes, each 1bit represents stripe_size (8192) sectors (512B), so 1bit=4MB (8192*512), and each CPU cache line=64B=512bit=2048MB. When 20 threads process a cached disk with 100G dirty data, a single thread processes about 23M at a time, and 20 threads total 460M. These full_dirty_stripes bits corresponding to the 460M data is likely to fall in the same CPU cache line. When one of these threads performs a set_bit or clear_bit operation, the same CPU cache line of other threads will become invalid and must read the full_dirty_stripes from the main memory again. Compared with single thread, the time of a bcache_dev_sectors_dirty_add() call is increased by about 50 times in our test (100G dirty data, 20 threads, bcache_dev_sectors_dirty_add() is called more than 20 million times). This patch tries to test_bit before set_bit or clear_bit operation. Therefore, a lot of force set and clear operations will be avoided, and most of bcache_dev_sectors_dirty_add() calls will only read CPU cache line. Signed-off-by: NMingzhe Zou <mingzhe.zou@easystack.cn> Signed-off-by: NColy Li <colyli@suse.de>
-
- 05 3月, 2022 10 次提交
-
-
由 Christoph Hellwig 提交于
Use the helpers instead of open coding them. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-11-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-8-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use memcpy_from_bvec instead of open coding the logic. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use the proper helper instead of open coding the copy. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220303111905.321089-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NIra Weiny <ira.weiny@intel.com> Acked-by: NMax Filippov <jcmvbkbc@gmail.com> Link: https://lore.kernel.org/r/20220303111905.321089-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 3月, 2022 1 次提交
-
-
git://git.infradead.org/nvme由 Jens Axboe 提交于
Pull NVMe updates from Christoph: "nvme updates for Linux 5.18 - add vectored-io support for user-passthrough (Kanchan Joshi) - add verbose error logging (Alan Adamson) - support buffered I/O on block devices in nvmet (Chaitanya Kulkarni) - central discovery controller support (Martin Belanger) - fix and extended the globally unique idenfier validation (me) - move away from the deprecated IDA APIs (Sagi Grimberg) - misc code cleanup (Keith Busch, Max Gurtovoy, Qinghua Jin, Chaitanya Kulkarni)" * tag 'nvme-5.18-2022-03-03' of git://git.infradead.org/nvme: (27 commits) nvme: check that EUI/GUID/UUID are globally unique nvme: check for duplicate identifiers earlier nvme: fix the check for duplicate unique identifiers nvme: cleanup __nvme_check_ids nvme: remove nssa from struct nvme_ctrl nvme: explicitly set non-error for directives nvme: expose cntrltype and dctype through sysfs nvme: send uevent on connection up nvme: add vectored-io support for user-passthrough nvme: add verbose error logging nvme: add a helper to initialize connect_q nvme-rdma: add helpers for mapping/unmapping request nvmet-tcp: replace ida_simple[get|remove] with the simler ida_[alloc|free] nvmet-rdma: replace ida_simple[get|remove] with the simler ida_[alloc|free] nvmet-fc: replace ida_simple[get|remove] with the simler ida_[alloc|free] nvmet: replace ida_simple[get|remove] with the simler ida_[alloc|free] nvme-fc: replace ida_simple[get|remove] with the simler ida_[alloc|free] nvme: replace ida_simple[get|remove] with the simler ida_[alloc|free] nvmet: allow bdev in buffered_io mode nvmet: use i_size_read() to set size for file-ns ...
-
- 28 2月, 2022 27 次提交
-
-
由 Christoph Hellwig 提交于
Add a check to verify that the unique identifiers are unique globally in addition to the existing check that verifies that they are unique inside a single subsystem. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
-
由 Christoph Hellwig 提交于
Lift the check for duplicate identifiers into nvme_init_ns_head, which avoids pointless error unwinding in case they don't match, and also matches where we check identifier validity for the multipath case. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
-
由 Christoph Hellwig 提交于
nvme_subsys_check_duplicate_ids should needs to return an error if any of the identifiers matches, not just if all of them match. But it does not need to and should not look at the CSI value for this sanity check. Rewrite the logic to be separate from nvme_ns_ids_equal and optimize it by reducing duplicate checks for non-present identifiers. Fixes: ed754e5d ("nvme: track shared namespaces") Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
-
由 Christoph Hellwig 提交于
Pass the actual nvme_ns_ids used for the comparison instead of the ns_head that isn't needed and use a more descriptive function name. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
-
由 Keith Busch 提交于
The reported number of streams is not used outside the function that gets it, so no need to stash it in the controller structure. Use a local variable instead. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Keith Busch 提交于
Stream directives is an optional feature. It is not an error if a controller doesn't support as many as the kernel can optionally use. Explicitly set the non-error return value on this condition with a comment explaining why. Note, the return value was already 0 in this condition, so the setting is redundant. This patch should just silence bots that falsely believe the condition contains an error omission. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Martin Belanger 提交于
TP8010 introduces the Discovery Controller Type attribute (dctype). The dctype is returned in the response to the Identify command. This patch exposes the dctype through the sysfs. Since the dctype depends on the Controller Type (cntrltype), another attribute of the Identify response, the patch also exposes the cntrltype as well. The dctype will only be displayed for discovery controllers. A note about the naming of this attribute: Although TP8010 calls this attribute the Discovery Controller Type, note that the dctype is now part of the response to the Identify command for all controller types. I/O, Discovery, and Admin controllers all share the same Identify response PDU structure. Non-discovery controllers as well as pre-TP8010 discovery controllers will continue to set this field to 0 (which has always been the default for reserved bytes). Per TP8010, the value 0 now means "Discovery controller type is not reported" instead of "Reserved". One could argue that this definition is correct even for non-discovery controllers, and by extension, exposing it in the sysfs for non-discovery controllers is appropriate. Signed-off-by: NMartin Belanger <martin.belanger@dell.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NJohn Meneghini <jmeneghi@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Martin Belanger 提交于
When connectivity with a controller is lost, the driver will keep trying to reconnect once every 10 sec. When connection is restored, user-space apps need to be informed so that they can take proper action. For example, TP8010 introduces the DIM PDU, which is used to register with a discovery controller (DC). The DIM PDU is sent from user-space. The DIM PDU must be sent every time a connection is established with a DC. Therefore, the kernel must tell user-space apps when connection is restored so that registration can happen. The uevent sent is a "change" uevent with environmental data set to: "NVME_EVENT=connected". Signed-off-by: NMartin Belanger <martin.belanger@dell.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NJohn Meneghini <jmeneghi@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Kanchan Joshi 提交于
Add a new NVME_IOCTL_IO64_CMD_VEC ioctl that works like the existing NVME_IOCTL_IO64_CMD ioctl except that it takes and array of iovecs and thus supports vectored I/O. - cmd.addr is base address of user iovec array - cmd.vec_cnt is count of iovec array elements This patch does not include vectored-variant for admin-commands as most of them are light on buffers and likely to have low invocation frequency. Signed-off-by: NKanchan Joshi <joshi.k@samsung.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Alan Adamson 提交于
Improves logging of NVMe errors. If NVME_VERBOSE_ERRORS is configured, a verbose description of the error is logged, otherwise only status codes/bits is logged. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> [kch]: fix several nits, cosmetics, and trim down code. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NAlan Adamson <alan.adamson@oracle.com> Reviewed-by: NHimanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
Add and use helper to remove duplicate code for fabrics connect_q initialization and error handling for all the transports. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Max Gurtovoy 提交于
Introduce nvme_rdma_dma_map_req/nvme_rdma_dma_unmap_req helper functions to improve code readability and ease on the error flow. Reviewed-by: NIsrael Rukshin <israelr@nvidia.com> Signed-off-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
ida_simple_[get|remove] are wrappers anyways. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
ida_simple_[get|remove] are wrappers anyways. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
ida_simple_[get|remove] are wrappers anyways. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
ida_simple_[get|remove] are wrappers anyways. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
ida_simple_[get|remove] are wrappers anyways. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Sagi Grimberg 提交于
ida_simple_[get|remove] are wrappers anyways. Also, use ida_alloc_min with the ns_ida as namespace enumeration starts with 1. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
Allow block device to be configured in the buffered I/O mode by using the file backend. In this way now we can use cache for the block device namespace which shows significant performance improvement. We update the block device ns enable function and return early when buffered_io flag is set. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
Instead of calling vfs_getattr() use i_size_read() to read the size of file so we can read the size of not only file type but also block type with one call. This is needed to implement buffered_io support for the NVMeOF block device backend. We also change return type of function nvmet_file_ns_revalidate() from int to void, since this function does not return any meaning value. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
Braces are not required for enum value NVME_SC_CONNECT_INVALID_PARAM when used on the switch-case statement, remove the braces. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
Remove zeroout memeset call & zeroout local variable cmd at the time of declaration in nvmf_ref_read32() similar to what we have done in nvmf_reg_read64(), nvmf_reg_write32(), nvmf_connect_admin_queue(), and nvmf_connect_io_queue(). Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
Loop variable i will never have a negative value, so use unsigned int type instaed of int. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
Loop variable i will never have a negative value, so use unsigned int type instaed of int. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
In function nvme_execute_rq() we don't use gendisk parameter at all. Remove the unsed parameter and adjust the calls. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
It is not a good practice to have a semicolon at the end of the function definition. Remove it from nvme_pr_type(). Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Qinghua Jin 提交于
subsytem -> subsystem Signed-off-by: NQinghua Jin <qhjin.dev@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-