- 27 9月, 2020 5 次提交
-
-
由 Amit Engel 提交于
A user may modify the kato by a set features cmd. To properly deal with races or a kato value of 0 (no keep alive enabled) change nvmet_set_feat_kato to first disable the timer, then set the value and then re-enable the timer. Signed-off-by: NAmit Engel <amit.engel@dell.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Mark Wunderlich 提交于
No real good need to spread queues artificially. Usually the target will serve multiple hosts, and it's better to run on the socket incoming cpu for better affinitization rather than spread queues on all online cpus. We rely on RSS to spread the work around sufficiently. Signed-off-by: NMark Wunderlich <mark.wunderlich@intel.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Andy Shevchenko 提交于
It's unusual that we have enumeration by class in the middle of the table. It might potentially be problematic in the future if we add another entry after it. So, move class matching entry to be the last in the ID table. Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
When using linked list we have to open code the locking, search, and destroy operations with the loops even if data structure doesn't fall into the fast path. One of the main advantage of having XArray to store, search, and remove items is that it handles all the locking by itself, avoids the loops when using linked lists, provides clear API to replace the linked list's search and destroy loops. This patch replaces the ctrl->cel list with XArray and removes :- a. Extra code needed for the linked list for ctrl->cel item management such as nvme_find_cel(). b. Destroy loop in the nvme_free_ctrl(). c. Explicit insertion locking in the nvme_get_effects_log(). Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Chaitanya Kulkarni 提交于
Lift opening the file open/close code from nvme_ctrl_get_by_path into the caller, just keeping a simple nvme_ctrl_from_file() helper. Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: refactored a bit, split the bug fixes into a separate prep patch] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NLogan Gunthorpe <logang@deltatee.com>
-
- 25 9月, 2020 24 次提交
-
-
由 Xiao Ni 提交于
For far layout, the discard region is not continuous on disks. So it needs far copies r10bio to cover all regions. It needs a way to know all r10bios have finish or not. Similar with raid10_sync_request, only the first r10bio master_bio records the discard bio. Other r10bios master_bio record the first r10bio. The first r10bio can finish after other r10bios finish and then return the discard bio. Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xiao Ni 提交于
Now the discard request is split by chunk size. So it takes a long time to finish mkfs on disks which support discard function. This patch improve handling raid10 discard request. It uses the similar way with patch 29efc390 (md/md0: optimize raid0 discard handling). But it's a little complex than raid0. Because raid10 has different layout. If raid10 is offset layout and the discard request is smaller than stripe size. There are some holes when we submit discard bio to underlayer disks. For example: five disks (disk1 - disk5) D01 D02 D03 D04 D05 D05 D01 D02 D03 D04 D06 D07 D08 D09 D10 D10 D06 D07 D08 D09 The discard bio just wants to discard from D03 to D10. For disk3, there is a hole between D03 and D08. For disk4, there is a hole between D04 and D09. D03 is a chunk, raid10_write_request can handle one chunk perfectly. So the part that is not aligned with stripe size is still handled by raid10_write_request. If reshape is running when discard bio comes and the discard bio spans the reshape position, raid10_write_request is responsible to handle this discard bio. I did a test with this patch set. Without patch: time mkfs.xfs /dev/md0 real4m39.775s user0m0.000s sys0m0.298s With patch: time mkfs.xfs /dev/md0 real0m0.105s user0m0.000s sys0m0.007s nvme3n1 259:1 0 477G 0 disk └─nvme3n1p1 259:10 0 50G 0 part nvme4n1 259:2 0 477G 0 disk └─nvme4n1p1 259:11 0 50G 0 part nvme5n1 259:6 0 477G 0 disk └─nvme5n1p1 259:12 0 50G 0 part nvme2n1 259:9 0 477G 0 disk └─nvme2n1p1 259:15 0 50G 0 part nvme0n1 259:13 0 477G 0 disk └─nvme0n1p1 259:14 0 50G 0 part Reviewed-by: NColy Li <colyli@suse.de> Reviewed-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xiao Ni 提交于
The following patch will reuse these logics, so pull the same codes into one function. Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xiao Ni 提交于
Now it allocs r10bio->devs[conf->copies]. Discard bio needs to submit to all member disks and it needs to use r10bio. So extend to r10bio->devs[geo.raid_disks]. Reviewed-by: NColy Li <colyli@suse.de> Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xiao Ni 提交于
Move these logic from raid0.c to md.c, so that we can also use it in raid10.c. Reviewed-by: NColy Li <colyli@suse.de> Reviewed-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: NXiao Ni <xni@redhat.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Zhen Lei 提交于
#define RESYNC_SECTORS (RESYNC_BLOCK_SIZE >> 9) "RESYNC_BLOCK_SIZE/512" is equal to "RESYNC_BLOCK_SIZE >> 9", replace it with RESYNC_SECTORS. Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
When try to resize stripe_size, we also need to free old shared page array and allocate new. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
When reshape array, we try to reuse shared pages of old stripe_head, and allocate more for the new one if needed. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
In current implementation, grow_buffers() uses alloc_page() to allocate the buffers for each stripe_head, i.e. allocate a page for each dev[i] in stripe_head. After setting stripe_size as a configurable value by writing sysfs entry, it means that we always allocate 64K buffers, but just use 4K of them when stripe_size is 4K in 64KB arm64. To avoid wasting memory, we try to let multiple sh->dev share one real page. That means, multiple sh->dev[i].page will point to the only page with different offset. Example of 64K PAGE_SIZE and 4K stripe_size as following: 64K PAGE_SIZE +---+---+---+---+------------------------------+ | | | | | | | | | | +-+-+-+-+-+-+-+-+------------------------------+ ^ ^ ^ ^ | | | +----------------------------+ | | | | | | +-------------------+ | | | | | | +----------+ | | | | | | +-+ | | | | | | | +-----+-----+------+-----+------+-----+------+------+ sh | offset(0) | offset(4K) | offset(8K) | offset(12K) | + +-----------+------------+------------+-------------+ +----> dev[0].page dev[1].page dev[2].page dev[3].page A new 'pages' array will be added into stripe_head to record shared page used by this stripe_head. Allocate them when grow_buffers() and free them when shrink_buffers(). After trying to share page, the users of sh->dev[i].page need to take care of the related page offset: page of issued bio and page passed to xor compution functions. But thanks for previous different page offset supported. Here, we just need to set correct dev[i].offset. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
For now, asynchronous raid6 recovery calculate functions are require common offset for pages. But, we expect them to support different page offset after introducing stripe shared page. Do that by simplily adding page offset where each page address are referred. Then, replace the old interface with the new ones in raid6 and raid6test. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
For now, syndrome compute functions require common offset in the pages array. However, we expect them to support different offset when try to use shared page in the following. Simplily covert them by adding page offset where each page address are referred. Since the only caller of async_gen_syndrome() and async_syndrome_val() are in raid6, we don't want to reserve the old interface but modify the interface directly. After that, replacing old interfaces with new ones for raid6 and raid6test. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
We try to replace async_xor() and async_xor_val() with the new introduced interface async_xor_offs() and async_xor_val_offs() for raid456. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
ops_run_biofill() and ops_run_biodrain() will call async_copy_data() to copy sh->dev[i].page from or to bio page. For now, it implies the offset of dev[i].page is 0. But we want to support different page offset in the following. Thus, pass page offset to these functions and replace 'page_offset' with 'page_offset + poff'. No functional change. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Yufen Yu 提交于
Add a new member of offset into struct r5dev. It indicates the offset of related dev[i].page. For now, since each device have a privated page, the value is always 0. Thus, we set offset as 0 when allcate page in grow_buffers() and resize_stripes(). To support following different page offset, we try to use the page offset rather than '0' directly for async_memcpy() and ops_run_io(). We try to support different page offset for xor compution functions in the following. To avoid repeatly allocate a new array each time, we add a memory region into scribble buffer to record offset. No functional change. Signed-off-by: NYufen Yu <yuyufen@huawei.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 Xianting Tian 提交于
We alreday has the interface i_blocksize(), which can be used to get blocksize, so use it. Only calculate blocksize once and use it within read_page(). Signed-off-by: NXianting Tian <tian.xianting@h3c.com> Signed-off-by: NSong Liu <songliubraving@fb.com>
-
由 John Garry 提交于
Support a shared tag bitmap, whereby request tags are unique over all submission queues, and not just per submission queue. As such, per device total queue depth is normally hw_queue_depth * submit_queues, but hw_queue_depth when set. And a similar story for when shared_tags is set, where that is the queue depth over all null blk devices. Signed-off-by: NJohn Garry <john.garry@huawei.com> Tested-by: NDouglas Gilbert <dgilbert@interlog.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The BDI_CAP_STABLE_WRITES is one of the few bits of information in the backing_dev_info shared between the block drivers and the writeback code. To help untangling the dependency replace it with a queue flag and a superblock flag derived from it. This also helps with the case of e.g. a file system requiring stable writes due to its own checksumming, but not forcing it on other users of the block device like the swap code. One downside is that we an't support the stable_pages_required bdi attribute in sysfs anymore. It is replaced with a queue attribute which also is writable for easier testing. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
BDI_CAP_SYNCHRONOUS_IO is only checked in the swap code, and used to decided if ->rw_page can be used on a block device. Just check up for the method instead. The only complication is that zram needs a second set of block_device_operations as it can switch between modes that actually support ->rw_page and those who don't. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Drivers shouldn't really mess with the readahead size, as that is a VM concept. Instead set it based on the optimal I/O size by lifting the algorithm from the md driver when registering the disk. Also set bdi->io_pages there as well by applying the same scheme based on max_sectors. To ensure the limits work well for stacking drivers a new helper is added to update the readahead limits from the block limits, which is also called from disk_stack_limits. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NColy Li <colyli@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The raid5 and raid10 drivers currently update the read-ahead size, but not the optimal I/O size on reshape. To prepare for deriving the read-ahead size from the optimal I/O size make sure it is updated as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NSong Liu <song@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Set up a readahead size by default, as very few users have a good reason to change it. This means code, ecryptfs, and orangefs now set up the values while they were previously missing it, while ubifs, mtd and vboxsf manually set it to 0 to avoid readahead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Richard Weinberger <richard@nod.at> [ubifs, mtd] Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
aoe forces a larger readahead size, but any reason to do larger I/O is not limited to readahead. Also set the optimal I/O size, and remove the local constants in favor of just using SZ_2G. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Inherit the optimal I/O size setting just like the readahead window, as any reason to do larger I/O does not apply to just readahead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Acked-by: NColy Li <colyli@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Ever since the switch to blk-mq, a lower device not used for VM writeback will not be marked congested, so the check will never trigger. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 24 9月, 2020 6 次提交
-
-
由 Christoph Hellwig 提交于
Use blkdev_get_by_dev instead of bdget_disk + blkdev_get. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NStefan Haberland <sth@linux.ibm.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Turn binding into a normal dev_t as the struct block device doesn't buy us anything and use blkdev_open_by_dev to actually open it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use blkdev_get_by_dev instead of bdgrab + blkdev_get. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Replace bdget + blkdev_get by blkdev_get_by_dev. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Remove code which has been dead since the initial commit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
We can only scan for partitions on the whole disk, so move the flag from struct block_device to struct gendisk. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 22 9月, 2020 5 次提交
-
-
由 Vladimir Oltean 提交于
The IS2 IP4_TCP_UDP key offsets do not correspond to the VSC7514 datasheet. Whether they work or not is unknown to me. On VSC9959 and VSC9953, with the same mistake and same discrepancy from the documentation, tc-flower src_port and dst_port rules did not work, so I am assuming the same is true here. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vladimir Oltean 提交于
Since these were copied from the Felix VCAP IS2 code, and only the offsets were adjusted, the order of the bit fields is still wrong. Fix it. Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xiaoliang Yang 提交于
Some of the IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT, L4_SPORT and other L4 keys. This prevents offloaded tc-flower rules from matching on src_port and dst_port for TCP and UDP packets. Signed-off-by: NXiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Saeed Mahameed 提交于
Returning errno is a bug, fix that. Also fixes smatch warnings: drivers/net/ethernet/mellanox/mlx5/core/en/port.c:453 mlx5e_fec_in_caps() warn: signedness bug returning '(-95)' Fixes: 2132b71f ("net/mlx5e: Advertise globaly supported FEC modes") Reported-by: Nkernel test robot <lkp@intel.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com> Reviewed-by: NMoshe Shemesh <moshe@nvidia.com> Reviewed-by: NAya Levin <ayal@nvidia.com>
-
由 Saeed Mahameed 提交于
The spinlock only needed when accessing the channel's icosq, grab the lock after the buf allocation in resync_post_get_progress_params() to avoid kzalloc(GFP_KERNEL) in atomic context. Fixes: 0419d8c9 ("net/mlx5e: kTLS, Add kTLS RX resync support") Reported-by: NYueHaibing <yuehaibing@huawei.com> Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com> Reviewed-by: NTariq Toukan <tariqt@nvidia.com>
-