- 16 Apr, 2015 23 commits
-
-
Committed by Ben Collins
I suspect this doesn't show up for most anyone because software algorithms typically don't have a sense of being too busy. However, when working with the Freescale CAAM driver it will return -EBUSY on occasion under heavy load -- which resulted in dm-crypt deadlock. After checking the logic in some other drivers, the scheme for crypt_convert() and its callback, kcryptd_async_done(), were not correctly laid out to properly handle -EBUSY or -EINPROGRESS. Fix this by using the completion for both -EBUSY and -EINPROGRESS. Now crypt_convert()'s use of completion is comparable to af_alg_wait_for_completion(). Similarly, kcryptd_async_done() follows the pattern used in af_alg_complete(). Before this fix dm-crypt would lock up within 1-2 minutes running with the CAAM driver. Fix was regression tested against software algorithms on PPC32 and x86_64, and things seem perfectly happy there as well.
Signed-off-by: Ben Collins <ben.c@servergy.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
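The rule this commit describes (block on the completion for both -EBUSY and -EINPROGRESS) can be sketched in userspace C. This is only an illustration: `must_wait_for_completion()` is a hypothetical helper, not a dm-crypt or crypto API function, and the interpretation of each return code in the comments is an assumption drawn from the commit text.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not a kernel function): given the return code of an
 * asynchronous crypto submission, decide whether the submitter has to wait
 * on the completion before moving on to the next block.
 */
bool must_wait_for_completion(int rc)
{
    switch (rc) {
    case -EINPROGRESS: /* accepted; the callback will signal completion */
    case -EBUSY:       /* assumed: driver was busy, request still pending */
        return true;
    default:           /* 0 = finished synchronously, anything else = error */
        return false;
    }
}
```

In this sketch a submission loop would wait whenever the helper returns true, so a busy driver throttles the submitter instead of leaving a request that nobody waits for.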
-
Committed by Mike Snitzer
Commit 003b5c57 ("block: Convert drivers to immutable biovecs") stopped short of changing dm-crypt to leverage the fact that the biovec array of a bio will no longer be modified. Switch to using bio_clone_fast() when cloning bios for decryption after read.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Milan Broz
Cryptsetup home page moved to GitLab. Also remove link to abandoned Truecrypt page.
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Josef Bacik
Introduce a new target that is meant for file system developers to test file system integrity at particular points in the life of a file system. We capture all write requests and associated data and log them to a separate device for later replay. There is a userspace utility to do this replay. The idea behind this is to give file system developers a tool to verify that the file system is always consistent.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Zach Brown <zab@zabbo.net>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Joe Perches
Use the normal return values for bool functions.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Sami Tolvanen
Add device specific modes to dm-verity to specify how corrupted blocks should be handled. The following modes are defined:
  - DM_VERITY_MODE_EIO is the default behavior, where reading a corrupted block results in -EIO.
  - DM_VERITY_MODE_LOGGING only logs corrupted blocks, but does not block the read.
  - DM_VERITY_MODE_RESTART calls kernel_restart when a corrupted block is discovered.
In addition, each mode sends a uevent to notify userspace of corruption and to allow further recovery actions. The driver defaults to previous behavior (DM_VERITY_MODE_EIO) and other modes can be enabled with an additional parameter to the verity table.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
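The three modes listed in this commit can be modelled as a small dispatch function. This is a userspace sketch only: the enum names come from the commit message, but `struct corruption_outcome`, `handle_corruption()`, and the result chosen for the restart case are assumptions made for illustration and are not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>

/* Mode names as given in the commit message; numeric values are arbitrary. */
enum verity_mode {
    DM_VERITY_MODE_EIO,
    DM_VERITY_MODE_LOGGING,
    DM_VERITY_MODE_RESTART
};

/* Hypothetical result type used only by this sketch. */
struct corruption_outcome {
    int  read_result;        /* what the reader of the corrupted block sees */
    bool restart_requested;  /* stands in for calling kernel_restart() */
};

/* Hypothetical handler: maps a mode to the behaviour the commit describes. */
struct corruption_outcome handle_corruption(enum verity_mode mode)
{
    struct corruption_outcome out = { -EIO, false };

    switch (mode) {
    case DM_VERITY_MODE_LOGGING:
        out.read_result = 0;          /* log only, do not block the read */
        break;
    case DM_VERITY_MODE_RESTART:
        out.restart_requested = true; /* assumed: read still fails meanwhile */
        break;
    case DM_VERITY_MODE_EIO:
    default:
        break;                        /* default behaviour: -EIO */
    }
    return out;
}
```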
-
Committed by Mike Snitzer
The 'trim' message wasn't ever implemented.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Nicholas Mc Guire
Converting milliseconds to jiffies by "val * HZ / 1000" is technically OK but msecs_to_jiffies(val) is the cleaner solution and handles all corner cases correctly.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
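One corner case where the two conversions differ is a delay shorter than one tick. The userspace sketch below assumes a tick rate of 100 Hz (`SKETCH_HZ`, a made-up constant) and uses a simplified round-up conversion that is valid only when the rate divides 1000 evenly; it is not the kernel's msecs_to_jiffies() and does not model its overflow clamping.

```c
/* Assumed tick rate for this sketch; real kernels configure HZ at build time. */
#define SKETCH_HZ 100UL

/* The open-coded form from the commit message: truncates toward zero. */
unsigned long naive_ms_to_ticks(unsigned long ms)
{
    return ms * SKETCH_HZ / 1000UL;
}

/* Simplified round-up conversion, valid when SKETCH_HZ divides 1000. */
unsigned long rounded_ms_to_ticks(unsigned long ms)
{
    unsigned long ms_per_tick = 1000UL / SKETCH_HZ;

    return (ms + ms_per_tick - 1) / ms_per_tick;
}
```

With these assumptions a 5 ms delay truncates to 0 ticks in the naive form and rounds up to 1 tick in the other, so the naive form can turn a short timeout into no timeout at all.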
-
Committed by Nicholas Mc Guire
This fixes up a compile warning [-Wunused-but-set-variable] - given the comment in userspace_set_region_sync() the non-reporting of errors is intentional so the return value can be dropped to make gcc happy. Also, fix typo in comment.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Nicholas Mc Guire
Return type of wait_for_completion_timeout() is unsigned long, not int. An appropriately named unsigned long is added and the assignment fixed.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Dan Ehrenberg
If a device is used as the root filesystem, it can't be built off of devices which are within the root filesystem (just like command line arguments to root=). For this reason, Linux has a pseudo-filesystem for root= and MD initialization (based on the function name_to_dev_t) which handles different ways of specifying devices including PARTUUID and major:minor. Switch to using name_to_dev_t() in dm_get_device(). Rather than having DM assume that all things which are not major:minor are paths in an already-mounted filesystem, change dm_get_device() to first attempt to look up the device in the filesystem, and if not found it will fall back to using name_to_dev_t(). In terms of backwards compatibility, there are some cases where behavior will be different:
  - If you have a file in the current working directory named 1:2 and you initialize DM there, then it will try to use that file rather than the disk with that major:minor pair as a backing device.
  - Similarly for other bdev types which name_to_dev_t() knows how to interpret, the previous behavior was to repeatedly check for the existence of the file (e.g., while waiting for rootfs to come up) but the new behavior is to use the name_to_dev_t() interpretation. For example, if you have a file named /dev/ubiblock0_0 which is a symlink to /dev/sda3, but it is not yet present when DM starts to initialize, then the name_to_dev_t() interpretation will take precedence.
These incompatibilities would only show up in really strange setups with bad practices so we shouldn't have to worry about them.
Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Dan Ehrenberg
In the kernel command-line, previously, root=1:2jakshflaksjdhfa would be accepted and interpreted just like root=1:2. This patch adds stricter checking so that additional characters after major:minor are rejected by root=. The goal of this change is to help in unifying DM's interpretation of its block device argument by using existing kernel code (name_to_dev_t). Since DM rejects malformed major:minor pairs, it seems reasonable for root= to reject them as well.
Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
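A common way to reject trailing characters after a major:minor pair is to ask the scanner for one more character and require that it is not there. The userspace sketch below shows that idea with the standard sscanf(); `parse_major_minor_strict()` is a hypothetical name and this is not presented as the kernel's parser.

```c
#include <stdio.h>

/*
 * Hypothetical strict parser: returns 0 and fills *maj / *min only when the
 * whole string is "<number>:<number>" with nothing after it, otherwise -1.
 */
int parse_major_minor_strict(const char *name, unsigned *maj, unsigned *min)
{
    unsigned m, n;
    char trailing;

    /*
     * sscanf() returns 2 only if both numbers matched and the extra %c found
     * no further character; trailing garbage makes it return 3.
     */
    if (sscanf(name, "%u:%u%c", &m, &n, &trailing) != 2)
        return -1;

    *maj = m;
    *min = n;
    return 0;
}
```

Under this check "1:2" is accepted, while "1:2jakshflaksjdhfa" and a bare "1:" are both rejected.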
-
Committed by Dan Ehrenberg
DM will switch its device lookup code to using name_to_dev_t() so it must be exported. Also, the @name argument should be marked const.
Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Request-based DM's blk-mq support defaults to off; but a user can easily change the default using the dm_mod.use_blk_mq module/boot option. Also, you can check what mode a given request-based DM device is using with:
  cat /sys/block/dm-X/dm/use_blk_mq
This change enabled further cleanup and reduced work (e.g. the md->io_pool and md->rq_pool aren't created if using blk-mq).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
dm_mq_queue_rq() is in atomic context so care must be taken to not sleep -- as such GFP_ATOMIC is used for the md->bs bioset allocations and dm-mpath's call to blk_get_request(). In the future the bioset allocations will hopefully go away (by removing support for partial completions of bios in a cloned request). Also prepare for supporting DM blk-mq on top of old-style request_fn device(s) if a new dm-mod 'use_blk_mq' parameter is set. The kthread will still be used to queue work if blk-mq is used on top of old-style request_fn device(s).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Commit e5863d9a ("dm: allocate requests in target when stacking on blk-mq devices") served as the first step toward fully utilizing blk-mq in request-based DM -- it enabled stacking an old-style (request_fn) request_queue on top of the underlying blk-mq device(s). That first step didn't improve performance of DM multipath on top of fast blk-mq devices (e.g. NVMe) because the top-level old-style request_queue was severely limited by the queue_lock. The second step offered here enables stacking a blk-mq request_queue on top of the underlying blk-mq device(s). This unlocks significant performance gains on fast blk-mq devices. Keith Busch tested on his NVMe testbed and offered this really positive news: "Just providing a performance update. All my fio tests are getting roughly equal performance whether accessed through the raw block device or the multipath device mapper (~470k IOPS). I could only push ~20% of the raw iops through dm before this conversion, so this latest tree is looking really solid from a performance standpoint."
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Keith Busch <keith.busch@intel.com>
-
Committed by Mike Snitzer
Otherwise, for sequential workloads, the dm_request_fn can allow excessive request merging at the expense of increased service time. Add a per-device sysfs attribute to allow the user to control how long a request, that is a reasonable merge candidate, can be queued on the request queue. The resolution of this request dispatch deadline is in microseconds (ranging from 1 to 100000 usecs). To set a 20us deadline:
  echo 20 > /sys/block/dm-7/dm/rq_based_seq_io_merge_deadline
The dm_request_fn's merge heuristic and associated extra accounting is disabled by default (rq_based_seq_io_merge_deadline is 0). This sysfs attribute is not applicable to bio-based DM devices so it will only ever report 0 for them. By allowing a request to remain on the queue it will block other requests on the queue. But introducing a short dequeue delay has proven very effective at enabling certain sequential IO workloads on really fast, yet IOPS constrained, devices to build up slightly larger IOs -- yielding 90+% throughput improvements. Having precise control over the time taken to wait for larger requests to build affords control beyond that of waiting for certain IO sizes to accumulate (which would require a deadline anyway). This knob will only ever make sense with sequential IO workloads and the particular value used is storage configuration specific. Given the expected niche use-case for when this knob is useful it has been deemed acceptable to expose this relatively crude method for crafting optimal IO on specific storage -- especially given the solution is simple yet effective. In the context of DM multipath, it is advisable to tune this sysfs attribute to a value that offers the best performance for the common case (e.g. if 4 paths are expected active, tune for that; if paths fail then performance may be slightly reduced). Alternatives were explored to have request-based DM autotune this value (e.g. if/when paths fail) but they were quickly deemed too fragile and complex to warrant further design and development time. If this problem proves more common as faster storage emerges we'll have to look at elevating a generic solution into the block core.
Tested-by: Shiva Krishna Merla <shivakrishna.merla@netapp.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
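The dispatch decision described in this commit (hold back a request that continues the previous one, but only until a deadline expires, and never when the knob is 0) can be sketched as a pure function. Everything here is hypothetical and written for userspace: `struct seq_merge_state`, its fields, and `should_leave_on_queue()` are illustration-only names, and the sector-adjacency test is an assumed stand-in for "reasonable merge candidate".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device bookkeeping for this sketch. */
struct seq_merge_state {
    uint64_t last_end_sector; /* sector just past the previous request */
    uint64_t last_start_us;   /* when the previous request was dispatched */
    unsigned deadline_us;     /* merge deadline; 0 disables the heuristic */
};

/*
 * Returns true when the request at rq_start_sector should stay on the queue
 * a little longer in the hope that it merges with later sequential IO.
 */
bool should_leave_on_queue(const struct seq_merge_state *s,
                           uint64_t rq_start_sector, uint64_t now_us)
{
    if (s->deadline_us == 0)
        return false;                     /* heuristic disabled (default) */
    if (rq_start_sector != s->last_end_sector)
        return false;                     /* not sequential: dispatch now */
    /* Sequential: wait only while the deadline has not yet passed. */
    return now_us - s->last_start_us < s->deadline_us;
}
```

With a 20us deadline, a sequential request seen 10us after the previous dispatch is held back, while the same request seen 30us later is dispatched.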
-
Committed by Mike Snitzer
Add DM_ATTR_RW() macro and establish .store method in dm_sysfs_ops.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Request-based DM's dm_request_fn() is so fast at pulling requests off the queue that steps need to be taken to promote merging by avoiding request processing if it makes sense. If the current request would've merged with the previous request, let the current request stay on the queue longer.
Suggested-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Commit 7eaceacc ("block: remove per-queue plugging") didn't justify DM's use of a 100ms delay; such an extended delay is a liability when there is reason to re-kick the queue.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
In request-based DM's dm_request_fn(), if blk_peek_request() returns NULL just return. Avoids unnecessary blk_delay_queue().
Reported-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
On really fast storage it can be beneficial to delay running the request_queue to allow the elevator more opportunity to merge requests. Otherwise, it has been observed that requests are being sent to q->request_fn much quicker than is ideal on IOPS-bound backends.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
The old dm_request() method used for q->make_request_fn had a branch for request-based DM support but it isn't needed given that dm_init_request_based_queue() sets it to the standard blk_queue_bio() anyway. Clean up dm_init_md_queue() to be DM device-type agnostic and have dm_setup_md_queue() properly finish queue setup based on DM device-type (bio-based vs request-based). A followup block patch can be made to remove the export for blk_queue_bio() now that DM no longer calls it directly.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 01 Apr, 2015 11 commits
-
-
Committed by Mike Snitzer
DM multipath is the only caller of blk_lld_busy() -- which calls a queue's lld_busy_fn hook. Request-based DM doesn't support stacking multipath devices so there is no reason to register the lld_busy_fn hook on a multipath device's queue using blk_queue_lld_busy(). As such, remove functions dm_lld_busy and dm_table_any_busy_target.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
There is no need for DM to export a wrapper around the already exported blk_lld_busy().
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
__dm_get_module_param() could be useful for future DM module parameters besides those related to "reserved_ios".
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Joe Thornber
Writeback takes out a lock on the cache block, so will increase the latency for any concurrent io. This patch works by placing 2 sentinel objects on each level of the multiqueues. Every WRITEBACK_PERIOD the oldest sentinel gets moved to the newest end of the queue level. When looking for writeback work:
  if less than 25% of the cache is clean:
    we select the oldest object with the lowest hit count
  otherwise:
    we select the oldest object that is not past a writeback sentinel.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Joe Thornber
Remove to stop wasting memory.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Joe Thornber
A sentinel object is placed on each level of the multiqueues. When an object is hit it is requeued behind the sentinel. When the tick is incremented we iterate through all objects behind the sentinel and update the hit_count, then reposition the sentinel at the very back. This saves memory by avoiding tracking the tick explicitly for every struct entry object in the multiqueues.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Joe Thornber
queue_shift_down() didn't adjust the hit_counts to the new levels, so it just had the effect of scrambling levels.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Joe Thornber
Small optimisation: now queue_empty() doesn't need to walk all levels of the multiqueue.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Committed by Mike Snitzer
Use a single slab cache to allocate a mempool for each dirty-log. This _should_ eliminate DM's need for io_schedule_timeout() in mempool_alloc(); so io_schedule() should be sufficient now. Also, rename struct flush_entry to dm_dirty_log_flush_entry to allow KMEM_CACHE() to create a meaningful global name for the slab cache. Also, eliminate some holes in struct log_c by rearranging members.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Heinz Mauelshagen <heinzm@redhat.com>
-
Committed by Mike Snitzer
-
- 31 Mar, 2015 1 commit
-
-
Committed by Mike Snitzer
Linux 3.19 commit 69c953c8 ("lib/lcm.c: lcm(n,0)=lcm(0,n) is 0, not n") caused blk_stack_limits() to not properly stack queue_limits for stacked devices (e.g. DM). Fix this regression by establishing lcm_not_zero() and switching blk_stack_limits() over to using it. DM uses blk_set_stacking_limits() to establish the initial top-level queue_limits that are then built up based on underlying devices' limits using blk_stack_limits(). In the case of optimal_io_size (io_opt) blk_set_stacking_limits() establishes a default value of 0. With commit 69c953c8, lcm(0, n) is no longer n, which compromises proper stacking of the underlying devices' io_opt.
Test:
  $ modprobe scsi_debug dev_size_mb=10 num_tgts=1 opt_blks=1536
  $ cat /sys/block/sde/queue/optimal_io_size
  786432
  $ dmsetup create node --table "0 100 linear /dev/sde 0"
Before this fix:
  $ cat /sys/block/dm-5/queue/optimal_io_size
  0
After this fix:
  $ cat /sys/block/dm-5/queue/optimal_io_size
  786432
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.19+
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
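The difference between the two helpers is easy to see in a userspace sketch. The functions below are re-implementations written for illustration under the semantics stated in the commit message (lcm(0, n) is 0, while the "not zero" variant falls back to the non-zero argument); the `_ul` names are made up and this is not the code from lib/lcm.c.

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned long gcd_ul(unsigned long a, unsigned long b)
{
    while (b) {
        unsigned long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* lcm with the post-3.19 semantics: any zero argument yields 0. */
unsigned long lcm_ul(unsigned long a, unsigned long b)
{
    if (a && b)
        return (a / gcd_ul(a, b)) * b;
    return 0;
}

/* Stacking-friendly variant: a zero argument means "no constraint yet". */
unsigned long lcm_not_zero_ul(unsigned long a, unsigned long b)
{
    unsigned long l = lcm_ul(a, b);

    if (l)
        return l;
    return b ? b : a;
}
```

Stacking a default io_opt of 0 with an underlying device's 786432 gives 0 with the plain lcm and 786432 with the "not zero" variant, which matches the before and after values in the test transcript above.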
-
- 30 Mar, 2015 5 commits
-
-
Committed by Wei Fang
Don't assign ->rq_timeout twice.
Signed-off-by: Wei Fang <fangwei1@huawei.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Xiaoguang Wang
At the beginning of blk_mq_alloc_tag_set(), we have already checked whether 'set->nr_hw_queues' is zero, so here remove this redundant check.
Signed-off-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Linus Torvalds
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Committed by Linus Torvalds
Pull ARM SoC fixes from Olof Johansson:
"The latest and greatest fixes for ARM platform code. Worth pointing out are:
  - Lines-wise, largest is a PXA fix for dealing with interrupts on DT that was quite broken. It's still newish code so while we could have held this off, it seemed appropriate to include now
  - Some GPIO fixes for OMAP platforms added a few lines. This was also fixes for code recently added (this release).
  - Small OMAP timer fix to behave better with partially upstreamed platforms, which is quite welcome.
  - Allwinner fixes about operating point control, reducing overclocking in some cases for better stability.
plus a handful of other smaller fixes across the map"
* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: juno: Fix misleading name of UART reference clock
  ARM: dts: sunxi: Remove overclocked/overvoltaged OPP
  ARM: dts: sun4i: a10-lime: Override and remove 1008MHz OPP setting
  ARM: socfpga: dts: fix spi1 interrupt
  ARM: dts: Fix gpio interrupts for dm816x
  ARM: dts: dra7: remove ti,hwmod property from pcie phy
  ARM: OMAP: dmtimer: disable pm runtime on remove
  ARM: OMAP: dmtimer: check for pm_runtime_get_sync() failure
  ARM: OMAP2+: Fix socbus family info for AM33xx devices
  ARM: dts: omap3: Add missing dmas for crypto
  ARM: dts: rockchip: disable gmac by default in rk3288.dtsi
  MAINTAINERS: add rockchip regexp to the ARM/Rockchip entry
  ARM: pxa: fix pxa interrupts handling in DT
  ARM: pxa: Fix typo in zeus.c
  ARM: sunxi: Have ARCH_SUNXI select RESET_CONTROLLER for clock driver usage
-
Committed by Olof Johansson
Merge tag 'sunxi-fixes-for-4.0' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes
Allwinner fixes for 4.0
There's a few fixes to merge for 4.0, one to add a select in the machine Kconfig option to fix a potential build failure, and two fixing cpufreq related issues.
* tag 'sunxi-fixes-for-4.0' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  ARM: dts: sunxi: Remove overclocked/overvoltaged OPP
  ARM: dts: sun4i: a10-lime: Override and remove 1008MHz OPP setting
  ARM: sunxi: Have ARCH_SUNXI select RESET_CONTROLLER for clock driver usage
Signed-off-by: Olof Johansson <olof@lixom.net>
-