- 02 2月, 2013 30 次提交
-
-
由 Fabio Baltieri 提交于
Fix governors code to set all cpu's cdbs->cpu to the the actual cpu id and use cur_policy->cpu istead of cdbs->cpu to track current governor's leader cpu. Reported-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NFabio Baltieri <fabio.baltieri@linaro.org> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Fabio Baltieri 提交于
Implement a generic helper function policy_is_shared() to replace the current dbs_sw_coordinated_cpus() at cpufreq level, so that it can be used by code other than cpufreq governors. Suggested-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NFabio Baltieri <fabio.baltieri@linaro.org> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
SPEAr cpufreq driver supports dual core Cortex-A9 SoC's, where cpus share policy structure. Whenever we update frequency of a cpu, we must notify all policy->cpus. Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
Documentation related to cpus and related_cpus is confusing and not very clear. Over that CPUFreq core has seen much changes recently. Lets update documentation and comments for cpus and related_cpus. Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Shawn Guo 提交于
As multiplatform build is being adopted by more and more ARM platforms, initcall function should be used very carefully. For example, when GENERIC_CPUFREQ_CPU0 is built in the kernel, cpu0_cpufreq_driver_init() will be called on all the platforms to initialize cpufreq-cpu0 driver. To eliminate this undesired the effect, the patch changes cpufreq-cpu0 driver to have it instantiated as a platform_driver. Then it will only run on platforms that create the platform_device "cpufreq-cpu0". Along with the change, it also changes cpu_dev to be &pdev->dev, so that managed functions can start working, and module build gets supported too. The highbank-cpufreq driver is also updated accordingly to adapt the changes on cpufreq-cpu0. Signed-off-by: NShawn Guo <shawn.guo@linaro.org> Reviewed-by: NViresh Kumar <viresh.kumar@linaro.org> Acked-by: NMark Langsdorf <mark.langsdorf@calxeda.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Fabio Baltieri 提交于
Drop unused arguments from dbs_timer_init and clean dbs_timer_exit and cpufreq_governor_dbs to remove non necessary special cases. Reported-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NFabio Baltieri <fabio.baltieri@linaro.org> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
Currently cpufreq_add_dev() firsts allocates policy, calls driver->init() and then checks if this CPU is already managed or not. And if it is already managed, its policy is freed. We can save all this if we somehow know that CPU is managed or not in advance. policy->related_cpus contains the list of all valid sibling CPUs of policy->cpu. We can check this to see if the current CPU is already managed. From now on, platforms don't really need to set related_cpus from their init() routines, as the same work is done by core too. If a platform driver needs to set the related_cpus mask with some additional CPUs, other than CPUs present in policy->cpus, they are free to do it, though, as we don't override anything. [rjw: Changelog] Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Tested-by: NShawn Guo <shawn.guo@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
This reverts commit 956f339 "cpufreq: Don't use cpu removed during cpufreq_driver_unregister". With the addition of the following commit, this change/variable is not required any more: commit b9ba2725343ae57add3f324dfa5074167f48de96 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Mon Jan 14 13:23:03 2013 +0000 cpufreq: Simplify __cpufreq_remove_dev() [rjw: Subject and changelog] Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Mark Langsdorf 提交于
Export cpufreq helpers in OPP to make the cpufreq-core0 and highbank-cpufreq drivers loadable as modules. Signed-off-by: NMark Langsdorf <mark.langsdorf@calxeda.com> Signed-off-by: NNishanth Menon <nm@ti.com> Acked-by: NShawn Guo <shawn.guo@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Nishanth Menon 提交于
We are GPLV2 library, so be clear in the symbols exported as well. Signed-off-by: NNishanth Menon <nm@ti.com> Acked-by: NShawn Guo <shawn.guo@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Mark Langsdorf 提交于
Highbank processors depend on the external ECME to perform voltage management based on a requested frequency. Communication between the A9 cores and the ECME happens over the pl320 IPC channel. Signed-off-by: NMark Langsdorf <mark.langsdorf@calxeda.com> Reviewed-by: NShawn Guo <shawn.guo@linaro.org> Reviewed-by: NMike Turquette <mturquette@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rob Herring 提交于
The pl320 IPC allows for interprocessor communication between the highbank A9 and the EnergyCore Management Engine. The pl320 implements a straightforward mailbox protocol. Signed-off-by: NMark Langsdorf <mark.langsdorf@calxeda.com> Signed-off-by: NRob Herring <rob.herring@calxeda.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Mark Langsdorf 提交于
The highbank clock will glitch with the current code if the clock rate is reset without relocking the PLL. Program the PLL correctly to prevent glitches. Signed-off-by: NMark Langsdorf <mark.langsdorf@calxeda.com> Signed-off-by: NRob Herring <rob.herring@calxeda.com> Acked-by: NMike Turquette <mturquette@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rob Herring 提交于
Move clk setup to twd_local_timer_common_register and rely on twd_timer_rate being 0 to force calibration if there is no clock. Remove common_setup_called as it is no longer needed. Signed-off-by: NRob Herring <rob.herring@calxeda.com> Signed-off-by: NMark Langsdorf <mark.langsdorf@calxeda.com> Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Borislav Petkov 提交于
Move function prototypes to a place where they logically fit better. Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Borislav Petkov 提交于
Make it hotplug-safe and cleanup formatting. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Borislav Petkov 提交于
Check whether we've actually already loaded acpi-cpufreq before requesting it. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Borislav Petkov 提交于
Add a helper function to return cpufreq_driver->name. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Borislav Petkov 提交于
Now that the majority of x86 CPUs out there are supported by acpi-cpufreq, we want it to load first and, in the AMD case, drop to powernow-k8 only on K8s. If, however, both powernow-k8 and acpi-cpufreq are built-in, the link order matters. Correct that. Signed-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Matthew Garrett 提交于
de3ed81d ("[CPUFREQ] Change link order of x86 cpufreq modules") changed cpufreq drivers link order so that powernow-k8 gets loaded first due to earlier K8s having BIOS bugs. However, now that acpi-cpufreq supports both AMD and Intel CPUs with HW P-states, we want to load it first, so that cases where acpi-cpufreq and powernow-k8 are both built-in and powernow-k8 initializing first, can be addressed. So, make sure that even if acpi-cpufreq gets loaded first, it errors out on K8s and powernow-k8 can be loaded then successfully. Signed-off-by: NMatthew Garrett <mjg59@srcf.ucam.org> References: http://lkml.kernel.org/r/20130118162347.GA31499@srcf.ucam.orgSigned-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Dirk Brandewie 提交于
When disable_cpufreq() is called some exported functions are still being used that do not have a check for cpufreq being disabled. Add a disabled check into cpufreq_cpu_get() to return NULL if cpufreq is disabled this covers most of the exported functions. For the exported functions that do not call cpufreq_cpu_get() add an explicit check. Signed-off-by: NDirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
__cpufreq_remove_dev() is called on multiple occasions: cpufreq_driver unregister and cpu removals. Current implementation of this routine is overly complex without much need. If the cpu to be removed is the policy->cpu, we remove the policy first and add all other cpus again from policy->cpus and then finally call __cpufreq_remove_dev() again to remove the cpu to be deleted. Haahhhh.. There exist a simple solution to removal of a cpu: - Simply use the old policy structure - update its fields like: policy->cpu, etc. - notify any users of cpufreq, which depend on changing policy->cpu Hence this patch, which tries to implement the above theory. It is tested well by myself on ARM big.LITTLE TC2 SoC, which has 5 cores (2 A15 and 3 A7). Both A15's share same struct policy and all A7's share same policy structure. Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Tested-by: NShawn Guo <shawn.guo@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
This patch fixes following sparse warning: drivers/cpufreq/spear-cpufreq.c:33:5: warning: symbol 'spear_cpufreq_verify' was not declared. Should it be static? Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
This is how the core works: cpufreq_driver_unregister() - subsys_interface_unregister() - for_each_cpu() call cpufreq_remove_dev(), i.e. 0,1,2,3,4 when we unregister. cpufreq_remove_dev(): - Remove policy node - Call cpufreq_add_dev() for next cpu, sharing mask with removed cpu. i.e. When cpu 0 is removed, we call it for cpu 1. And when called for cpu 2, we call it for cpu 3. - cpufreq_add_dev() would call cpufreq_driver->init() - init would return mask as AND of 2, 3 and 4 for cluster A7. - cpufreq core would do online_cpu && policy->cpus Here is the BUG(). Because cpu hasn't died but we have just unregistered the cpufreq driver, online cpu would still have cpu 2 in it. And so thing go bad again. Solution: Keep cpumask of cpus that are registered with cpufreq core and clear cpus when we get a call from subsys_interface_unregister() via cpufreq_remove_dev(). Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
Because cpufreq core and governors worry only about the online cpus, if a cpu is hot [un]plugged, we must notify governors about it, otherwise be ready to expect something unexpected. We already have notifiers in the form of CPUFREQ_GOV_START/CPUFREQ_GOV_STOP, we just need to call them now. Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Viresh Kumar 提交于
cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors. We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core. Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Fabio Baltieri 提交于
Modify update_sampling_rate() to check, and eventually immediately schedule, all CPU's do_dbs_timer delayed work. This is required in case of software coordinated CPUs, as we now have a separate delayed work for each CPU. Signed-off-by: NFabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Fabio Baltieri 提交于
Modify conservative timer to not resample CPU utilization if recently sampled from another SW coordinated core. Signed-off-by: NFabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Fabio Baltieri 提交于
Modify ondemand timer to not resample CPU utilization if recently sampled from another SW coordinated core. Signed-off-by: NFabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
由 Rickard Andersson 提交于
This patch fixes a bug that occurred when we had load on a secondary CPU and the primary CPU was sleeping. Only one sampling timer was spawned and it was spawned as a deferred timer on the primary CPU, so when a secondary CPU had a change in load this was not detected by the cpufreq governor (both ondemand and conservative). This patch make sure that deferred timers are run on all CPUs in the case of software controlled CPUs that run on the same frequency. Signed-off-by: NRickard Andersson <rickard.andersson@stericsson.com> Signed-off-by: NFabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 01 2月, 2013 6 次提交
-
-
由 Linus Torvalds 提交于
-
git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm由 Linus Torvalds 提交于
Pull more device-mapper fixes from Alasdair G Kergon: "A fix for stacked dm thin devices and a fix for the new dm WRITE SAME support." * tag 'dm-3.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: dm: fix write same requests counting dm thin: fix queue limits stacking
-
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid由 Linus Torvalds 提交于
PullHID fixes from Jiri Kosina: - fix i2c-hid and hidraw interaction, by Benjamin Tissoires - a quirk to make a particular device (Formosa IR receiver) work properly, by Nicholas Santos * 'for-3.8/upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: i2c-hid: fix i2c_hid_output_raw_report HID: usbhid: quirk for Formosa IR receiver HID: remove x bit from sensor doc
-
git://git.linux-nfs.org/projects/trondmy/linux-nfs由 Linus Torvalds 提交于
Pull NFS client bugfixes from Trond Myklebust: - Error reporting in nfs_xdev_mount incorrectly maps all errors to ENOMEM - Fix an NFSv4 refcounting issue - Fix a mount failure when the server reboots during NFSv4 trunking discovery - NFSv4.1 mounts may need to run the lease recovery thread. - Don't silently fail setattr() requests on mountpoints - Fix a SUNRPC socket/transport livelock and priority queue issue - We must handle NFS4ERR_DELAY when resetting the NFSv4.1 session. * tag 'nfs-for-3.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 session SUNRPC: When changing the queue priority, ensure that we change the owner NFS: Don't silently fail setattr() requests on mountpoints NFSv4.1: Ensure that nfs41_walk_client_list() does start lease recovery NFSv4: Fix NFSv4 trunking discovery NFSv4: Fix NFSv4 reference counting for trunked sessions NFS: Fix error reporting in nfs_xdev_mount
-
git://git.linux-mips.org/pub/scm/ralf/upstream-linus由 Linus Torvalds 提交于
Pull MIPS updates from Ralf Baechle: "A number of fixes all across the MIPS tree. No area is particularly standing out and things have cooled down quite nicely for a release." * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: Function tracer: Fix broken function tracing mips: Move __virt_addr_valid() to a place for MIPS 64 MIPS: Netlogic: Fix UP compilation on XLR MIPS: AR71xx: Fix AR71XX_PCI_MEM_SIZE MIPS: AR724x: Fix AR724X_PCI_MEM_SIZE MIPS: Lantiq: Fix cp0_perfcount_irq mapping MIPS: DSP: Fix DSP mask for registers. MIPS: Fix build failure by adding definition of pfn_pmd(). MIPS: Octeon: Fix warning. MIPS: delay.c: Check BITS_PER_LONG instead of __SIZEOF_LONG__ MIPS: PNX833x: Fix comment. MIPS: Add struct p_format to union mips_instruction. MIPS: Export <asm/break.h>. MIPS: BCM47xx: Enable SSB prerequisite SSB_DRIVER_PCICORE. MIPS: BCM47xx: Select GPIOLIB for BCMA on bcm47xx platform MIPS: vpe.c: Fix null pointer dereference in print arguments.
-
由 Benjamin Tissoires 提交于
i2c_hid_output_raw_report is used by hidraw to forward set_report requests. The current implementation of i2c_hid_set_report needs to take the report_id as an argument. The report_id is stored in the first byte of the buffer in argument of i2c_hid_output_raw_report. Not removing the report_id from the given buffer adds this byte 2 times in the command, leading to a non working command. Reported-by: NAndrew Duggan <aduggan@synaptics.com> Signed-off-by: NBenjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 31 1月, 2013 4 次提交
-
-
由 Al Cooper 提交于
Function tracing is currently broken for all 32 bit MIPS platforms. When tracing is enabled, the kernel immediately hangs on boot. This is a result of commit b732d439 that changes the kernel/trace/Kconfig file so that is no longer forces FRAME_POINTER when FUNCTION_TRACING is enabled. MIPS frame pointers are generally considered to be useless because they cannot be used to unwind the stack. Unfortunately the MIPS function tracing code has bugs that are masked by the use of frame pointers. This commit fixes the bugs so that MIPS frame pointers don't need to be enabled. The bugs are a result of the odd calling sequence used to call the trace routine. This calling sequence is inserted into every traceable function when the tracing CONFIG option is enabled. This sequence is generated for 32bit MIPS platforms by the compiler via the "-pg" flag. Part of the sequence is "addiu sp,sp,-8" in the delay slot after every call to the trace routine "_mcount" (some legacy thing where 2 arguments used to be pushed on the stack). The _mcount routine is expected to adjust the sp by +8 before returning. So when not disabled, the original jalr and addiu will be there, so _mcount has to adjust sp. The problem is that when tracing is disabled for a function, the "jalr _mcount" instruction is replaced with a nop, but the "addiu sp,sp,-8" is still executed and the stack pointer is left trashed. When frame pointers are enabled the problem is masked because any access to the stack is done through the frame pointer and the stack pointer is restored from the frame pointer when the function returns. This patch writes two nops starting at the address of the "jalr _mcount" instruction whenever tracing is disabled. This means that the "addiu sp,sp.-8" will be converted to a nop along with the "jalr". When disabled, there will be two nops. This is SMP safe because the first time this happens is during ftrace_init() which is before any other processor has been started. Subsequent calls to enable/disable tracing when other CPUs ARE running will still be safe because the enable will only change the first nop to a "jalr" and the disable, while writing 2 nops, will only be changing the "jalr". This patch also stops using stop_machine() to call the tracer enable/disable routines and calls them directly because the routines are SMP safe. When the kernel first boots we have to be able to handle the gcc generated jalr, addui sequence until ftrace_init gets a chance to run and change the sequence. At this point mcount just adjusts the stack and returns. When ftrace_init runs, we convert the jalr/addui to nops. Then whenever tracing is enabled we convert the first nop to a "jalr mcount+8". The mcount+8 entry point skips the stack adjust. [ralf@linux-mips.org: Folded in Steven Rostedt's build fix.] Signed-off-by: NAl Cooper <alcooperx@gmail.com> Cc: rostedt@goodmis.org Cc: ddaney.cavm@gmail.com Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/4806/ Patchwork: https://patchwork.linux-mips.org/patch/4841/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Alasdair G Kergon 提交于
When processing write same requests, fix dm to send the configured number of WRITE SAME requests to the target rather than the number of discards, which is not always the same. Device-mapper WRITE SAME support was introduced by commit 23508a96 ("dm: add WRITE SAME support"). Signed-off-by: NAlasdair G Kergon <agk@redhat.com> Acked-by: NMike Snitzer <snitzer@redhat.com>
-
由 Steven Rostedt 提交于
Commit d3ce8843 "MIPS: Fix modpost error in modules attepting to use virt_addr_valid()" moved __virt_addr_valid() from a macro in a header file to a function in ioremap.c. But ioremap.c is only compiled for MIPS 32, and not for MIPS 64. When compiling for my yeeloong2, which supposedly supports hibernation, which compiles kernel/power/snapshot.c which calls virt_addr_valid(), I got this error: LD init/built-in.o kernel/built-in.o: In function `memory_bm_free': snapshot.c:(.text+0x4c9c4): undefined reference to `__virt_addr_valid' snapshot.c:(.text+0x4ca58): undefined reference to `__virt_addr_valid' kernel/built-in.o: In function `snapshot_write_next': (.text+0x4e44c): undefined reference to `__virt_addr_valid' kernel/built-in.o: In function `snapshot_write_next': (.text+0x4e890): undefined reference to `__virt_addr_valid' make[1]: *** [vmlinux] Error 1 make: *** [sub-make] Error 2 I suspect that __virt_addr_valid() is fine for mips 64. I moved it to mmap.c such that it gets compiled for mips 64 and 32. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/4842/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
-
由 Mike Snitzer 提交于
thin_io_hints() is blindly copying the queue limits from the thin-pool which can lead to incorrect limits being set. The fix here simply deletes the thin_io_hints() hook which leaves the existing stacking infrastructure to set the limits correctly. When a thin-pool uses an MD device for the data device a thin device from the thin-pool must respect MD's constraints about disallowing a bio from spanning multiple chunks. Otherwise we can see problems. If the raid0 chunksize is 1152K and thin-pool chunksize is 256K I see the following md/raid0 error (with extra debug tracing added to thin_endio) when mkfs.xfs is executed against the thin device: md/raid0:md99: make_request bug: can't convert block across chunks or bigger than 1152k 6688 127 device-mapper: thin: bio sector=2080 err=-5 bi_size=130560 bi_rw=17 bi_vcnt=32 bi_idx=0 This extra DM debugging shows that the failing bio is spanning across the first and second logical 1152K chunk (sector 2080 + 255 takes the bio beyond the first chunk's boundary of sector 2304). So the bio splitting that DM is doing clearly isn't respecting the MD limits. max_hw_sectors_kb is 127 for both the thin-pool and thin device (queue_max_hw_sectors returns 255 so we'll excuse sysfs's lack of precision). So this explains why bi_size is 130560. But the thin device's max_hw_sectors_kb should be 4 (PAGE_SIZE) given that it doesn't have a .merge function (for bio_add_page to consult indirectly via dm_merge_bvec) yet the thin-pool does sit above an MD device that has a compulsory merge_bvec_fn. This scenario is exactly why DM must resort to sending single PAGE_SIZE bios to the underlying layer. Some additional context for this is available in the header for commit 8cbeb67a ("dm: avoid unsupported spanning of md stripe boundaries"). Long story short, the reason a thin device doesn't properly get configured to have a max_hw_sectors_kb of 4 (PAGE_SIZE) is that thin_io_hints() is blindly copying the queue limits from the thin-pool device directly to the thin device's queue limits. Fix this by eliminating thin_io_hints. Doing so is safe because the block layer's queue limits stacking already enables the upper level thin device to inherit the thin-pool device's discard and minimum_io_size and optimal_io_size limits that get set in pool_io_hints. But avoiding the queue limits copy allows the thin and thin-pool limits to be different where it is important, namely max_hw_sectors_kb. Reported-by: NDaniel Browning <db@kavod.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-