- 28 2月, 2010 3 次提交
-
-
由 Ian Campbell 提交于
Now that both Xen and VMI disable allocations of PTE pages from high memory this paravirt op serves no further purpose. This effectively reverts ce6234b5 "add kmap_atomic_pte for mapping highpte pages". Signed-off-by: NIan Campbell <ian.campbell@citrix.com> LKML-Reference: <1267204562-11844-3-git-send-email-ian.campbell@citrix.com> Acked-by: NAlok Kataria <akataria@vmware.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Ian Campbell 提交于
Preventing HIGHPTE allocations under VMI will allow us to remove the kmap_atomic_pte paravirt op. Signed-off-by: NIan Campbell <ian.campbell@citrix.com> LKML-Reference: <1267204562-11844-2-git-send-email-ian.campbell@citrix.com> Acked-by: NAlok Kataria <akataria@vmware.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Ian Campbell 提交于
There's a path in the pagefault code where the kernel deliberately breaks its own locking rules by kmapping a high pte page without holding the pagetable lock (in at least page_check_address). This breaks Xen's ability to track the pinned/unpinned state of the page. There does not appear to be a viable workaround for this behaviour so simply disable HIGHPTE for all Xen guests. Signed-off-by: NIan Campbell <ian.campbell@citrix.com> LKML-Reference: <1267204562-11844-1-git-send-email-ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pasi Kärkkäinen <pasik@iki.fi> Cc: <stable@kernel.org> # .32.x: 14315592: Allow highmem user page tables to be disabled at boot time Cc: <stable@kernel.org> # .32.x Cc: <xen-devel@lists.xensource.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 26 2月, 2010 1 次提交
-
-
由 Pekka Enberg 提交于
This patch changes the 32-bit version of kernel_physical_mapping_init() to return the last mapped address like the 64-bit one so that we can unify the call-site in init_memory_mapping(). Cc: Yinghai Lu <yinghai@kernel.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <alpine.DEB.2.00.1002241703570.1180@melkki.cs.helsinki.fi> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 25 2月, 2010 2 次提交
-
-
由 Ian Campbell 提交于
Distros generally (I looked at Debian, RHEL5 and SLES11) seem to enable CONFIG_HIGHPTE for any x86 configuration which has highmem enabled. This means that the overhead applies even to machines which have a fairly modest amount of high memory and which therefore do not really benefit from allocating PTEs in high memory but still pay the price of the additional mapping operations. Running kernbench on a 4G box I found that with CONFIG_HIGHPTE=y but no actual highptes being allocated there was a reduction in system time used from 59.737s to 55.9s. With CONFIG_HIGHPTE=y and highmem PTEs being allocated: Average Optimal load -j 4 Run (std deviation): Elapsed Time 175.396 (0.238914) User Time 515.983 (5.85019) System Time 59.737 (1.26727) Percent CPU 263.8 (71.6796) Context Switches 39989.7 (4672.64) Sleeps 42617.7 (246.307) With CONFIG_HIGHPTE=y but with no highmem PTEs being allocated: Average Optimal load -j 4 Run (std deviation): Elapsed Time 174.278 (0.831968) User Time 515.659 (6.07012) System Time 55.9 (1.07799) Percent CPU 263.8 (71.266) Context Switches 39929.6 (4485.13) Sleeps 42583.7 (373.039) This patch allows the user to control the allocation of PTEs in highmem from the command line ("userpte=nohigh") but retains the status-quo as the default. It is possible that some simple heuristic could be developed which allows auto-tuning of this option however I don't have a sufficiently large machine available to me to perform any particularly meaningful experiments. We could probably handwave up an argument for a threshold at 16G of total RAM. Assuming 768M of lowmem we have 196608 potential lowmem PTE pages. Each page can map 2M of RAM in a PAE-enabled configuration, meaning a maximum of 384G of RAM could potentially be mapped using lowmem PTEs. Even allowing generous factor of 10 to account for other required lowmem allocations, generous slop to account for page sharing (which reduces the total amount of RAM mappable by a given number of PT pages) and other innacuracies in the estimations it would seem that even a 32G machine would not have a particularly pressing need for highmem PTEs. I think 32G could be considered to be at the upper bound of what might be sensible on a 32 bit machine (although I think in practice 64G is still supported). It's seems questionable if HIGHPTE is even a win for any amount of RAM you would sensibly run a 32 bit kernel on rather than going 64 bit. Signed-off-by: NIan Campbell <ian.campbell@citrix.com> LKML-Reference: <1266403090-20162-1-git-send-email-ian.campbell@citrix.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
This will save 64K bytes from memory when loading linux if DMI is disabled, which is good for embedded systems. Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@holoscopio.com> LKML-Reference: <1265758732-19320-1-git-send-email-cascardo@holoscopio.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 18 2月, 2010 2 次提交
-
-
由 Thomas Gleixner 提交于
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
x86/mm is on 32-rc4 and missing the spinlock namespace changes which are needed for further commits into this topic. Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 17 2月, 2010 32 次提交
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: serial: 8250: add serial transmitter fully empty test
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: gadget: fix EEM gadget CRC usage USB: otg Kconfig: let USB_OTG_UTILS select USB_ULPI option USB: g_multi: fix CONFIG_USB_G_MULTI_RNDIS usage kfifo: Don't use integer as NULL pointer USB: FHCI: Fix build after kfifo rework kfifo: Make kfifo_initialized work after kfifo_free USB: serial: add usbid for dell wwan card to sierra.c USB: SIS USB2VGA DRIVER: support KAIREN's USB VGA adaptor USB20SVGA-MB-PLUS USB: ehci: phy low power mode bug fixing USB: s3c-hsotg: Export usb_gadget_register_driver() USB: r8a66597-udc: Prototype IS_ERR() and PTR_ERR() USB: ftdi_sio: add device IDs (several ELV, one Mindstorms NXT) USB: storage: Remove unneeded SC/PR from unusual_devs.h USB: ftdi_sio: new device id for papouch AD4USB USB: usbfs: properly clean up the as structure on error paths USB: usbfs: only copy the actual data received
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: class: Free the class private data in class_release sysfs: sysfs_sd_setattr set iattrs unconditionally
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits) be2net: set proper value to version field in req hdr xfrm: Fix xfrm_state_clone leak ipcomp: Avoid duplicate calls to ipcomp_destroy ethtool: allow non-admin user to read GRO settings. ixgbe: fix WOL register setup for 82599 ixgbe: Fix - Do not allow Rx FC on 82598 at 1G due to errata sfc: Fix SFE4002 initialisation mac80211: fix handling of null-rate control in rate_control_get_rate inet: Remove bogus IGMPv3 report handling iwlwifi: fix AMSDU Rx after paged Rx patch tcp: fix ICMP-RTO war via-velocity: Fix races on shared interrupts via-velocity: Take spinlock on set coalesce via-velocity: Remove unused IRQ status parameter from rx_srv and tx_srv rtl8187: Add new device ID iwmc3200wifi: Test of wrong pointer after kzalloc in iwm_mlme_update_bss_table() ath9k: Fix sequence numbers for PAE frames mac80211: fix deferred hardware scan requests iwlwifi: Fix to set correct ht configuration mac80211: Fix probe request filtering in IBSS mode ...
-
由 Dick Hollenbeck 提交于
When controlling an industrial radio modem it can be necessary to manipulate the handshake lines in order to control the radio modem's transmitter, from userspace. The transmitter should not be turned off before all characters have been transmitted. serial8250_tx_empty() was reporting that all characters were transmitted before they actually were. === Discovered in parallel with more testing and analysis by Kees Schoenmakers as follows: I ran into an NetMos 9835 serial pci board which behaves a little different than the standard. This type of expansion board is very common. "Standard" 8250 compatible devices clear the 'UART_LST_TEMT" bit together with the "UART_LSR_THRE" bit when writing data to the device. The NetMos device does it slightly different I believe that the TEMT bit is coupled to the shift register. The problem is that after writing data to the device and very quickly after that one does call serial8250_tx_empty, it returns the wrong information. My patch makes the test more robust (and solves the problem) and it does not affect the already correct devices. Alan: We may yet need to quirk this but now we know which chips we have a way to do that should we find this breaks some other 8250 clone with dodgy THRE. Signed-off-by: NDick Hollenbeck <dick@softplc.com> Signed-off-by: NAlan Cox <alan@linux.intel.com> Cc: Kees Schoenmakers <k.schoenmakers@sigmae.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: stable <stable@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Laurent Pinchart 提交于
Fix a memory leak by freeing the memory allocated in __class_register for the class private data. Signed-off-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable <stable@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Eric W. Biederman 提交于
There is currently a bug in sysfs_sd_setattr inherited from sysfs_setattr in 2.6.32 where the first time we set the attributes on a sysfs file we allocate backing store but do not set the backing store attributes. Resulting in overly restrictive permissions on sysfs files. The fix is to simply modify the code so that it always executes when we update the sysfs attributes, as we did in 2.6.31 and earlier. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Tested-by: NJean Delvare <khali@linux-fr.org> Cc: stable <stable@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Brian Niebuhr 提交于
eem_wrap() is sending a sentinel CRC, but it didn't indicate that to the host, it should zero bit 14 (bmCRC) in the EEM packet header, instead of setting it. Also remove a redundant crc calculation in eem_unwrap(). Signed-off-by: NSteve Longerbeam <stevel@netspectrum.com> Acked-by: NBrian Niebuhr <bniebuhr@efjohnson.com> Acked-by: NDavid Brownell <dbrownell@users.sourceforge.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Valentin Longchamp 提交于
With CONFIG_USB_ULPI=y, CONFIG_USB<=m, CONFIG_PCI=n and CONFIG_USB_OTG_UTILS=n, which is the default used for mx31moboard, the build for all mx3 platforms fails because drivers/usb/otg/ulpi.c where otg_ulpi_create is defined is not compiled. Build error: arch/arm/mach-mx3/built-in.o: In function `mxc_board_init': kzmarm11.c:(.init.text+0x73c): undefined reference to `otg_ulpi_create' kzmarm11.c:(.init.text+0x1020): undefined reference to `otg_ulpi_create' This isn't a strong dependency as drivers/usb/otg/ulpi.c doesn't use functions defined in drivers/usb/otg/otg.o and is only needed to get ulpi.o linked into the kernel image. Signed-off-by: NValentin Longchamp <valentin.longchamp@epfl.ch> Acked-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Michal Nazarewicz 提交于
g_multi used CONFIG_USB_ETH_RNDIS to check if RNDIS option was requested where it should check for CONFIG_USB_G_MULTI_RNDIS. As a result, RNDIS was never present in g_multi regardless of configuration. This fixes changes made in commit 396cda90. Signed-off-by: NMichal Nazarewicz <m.nazarewicz@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Anton Vorontsov 提交于
This patch fixes following sparse warnings: include/linux/kfifo.h:127:25: warning: Using plain integer as NULL pointer kernel/kfifo.c:83:21: warning: Using plain integer as NULL pointer Signed-off-by: NAnton Vorontsov <avorontsov@ru.mvista.com> Acked-by: NStefani Seibold <stefani@seibold.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Anton Vorontsov 提交于
After kfifo rework FHCI fails to build: CC drivers/usb/host/fhci-tds.o drivers/usb/host/fhci-tds.c: In function 'fhci_ep0_free': drivers/usb/host/fhci-tds.c:108: error: used struct type value where scalar is required drivers/usb/host/fhci-tds.c:118: error: used struct type value where scalar is required drivers/usb/host/fhci-tds.c:128: error: used struct type value where scalar is required This is because kfifos are no longer pointers in the ep struct. So, instead of checking the pointers, we should now check if kfifo is initialized. Reported-by: NJosh Boyer <jwboyer@gmail.com> Signed-off-by: NAnton Vorontsov <avorontsov@ru.mvista.com> Acked-by: NStefani Seibold <stefani@seibold.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Anton Vorontsov 提交于
After kfifo rework it's no longer possible to reliably know if kfifo is usable, since after kfifo_free(), kfifo_initialized() would still return true. The correct behaviour is needed for at least FHCI USB driver. This patch fixes the issue by resetting the kfifo to zero values (the same approach is used in kfifo_alloc() if allocation failed). Signed-off-by: NAnton Vorontsov <avorontsov@ru.mvista.com> Acked-by: NStefani Seibold <stefani@seibold.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Richard Farina 提交于
This patch adds support for Dell Computer Corp. Wireless 5720 VZW Mobile Broadband (EVDO Rev-A) Minicard GPS Port. I stole the name from lsusb, but my card does not have a GPS on it (at least not that I can make function). I'm sure the patch is whitespace damaged but the one line addition should be fairly straightforward nonetheless. Tested-by: NRick Farina <sidhayn@gmail.com> Signed-off-by: NRick Farina <sidhayn@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Tanaka Akira 提交于
This patch adds the USB product ID of KAIREN's USB VGA Adaptor, USB20SVGA-MB-PLUS, to sisusbvga work with it. Signed-off-by: NTanaka Akira <akr@fsij.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Alek Du 提交于
1. There are two msleep calls inside two spin lock sections, need to unlock and lock again after msleep. 2. Save a extra status reg setting. Signed-off-by: NAlek Du <alek.du@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Mark Brown 提交于
USB gadget controller drivers normally export their driver registration function, allowing modular builds of the individual gadget drivers so do so for s3c-hsotg, fixing builds. Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Mark Brown 提交于
The build of r8a66597-udc was failing on ARM since IS_ERR() and PTR_ERR() weren't protyped. Presumably err.h is being pulled in by another header on other platforms. Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: NYoshihiro Shimoda <shimoda.yoshihiro@renesas.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Andreas Mohr 提交于
- add FTDI device IDs for several ELV devices and NXTCam of Lego Mindstorms NXT - add hopefully helpful new_id comment - remove less helpful "Due to many user requests for multiple ELV devices we enable them by default." comment (we simply add _all_ known devices - an enduser shouldn't have to fiddle with obscure module parameters...). - add myself to DRIVER_AUTHOR The missing NXTCam ID has been found at http://www.unixboard.de/vb3/showthread.php?t=44155 , ELV devices taken from ELV Windows .inf file. Signed-off-by: NAndreas Mohr <andi@lisas.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Phil Dibowitz 提交于
This patch removes the subclass and protocol entries from a Microtech entry in unusual_devs.h. This was reported by <ryck@pacbell.net>. Greg, please apply. Signed-off-by: NPhil Dibowitz <phil@ipom.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Radek Liboska 提交于
added new device pid (PAPOUCH_AD4USB_PID) to ftdi_sio.h and ftdi_sio.c AD4USB measuring converter is a 4-input A/D converter which enables the user to measure to four current inputs ranging from 0(4) to 20 mA or voltage between 0 and 10 V. The measured values are then transferred to a superior system in digital form. The AD4USB communicates via USB. Powered is also via USB. datasheet in english is here: http://www.papouch.com/shop/scripts/pdf/ad4usb_en.pdfSigned-off-by: NRadek Liboska <liboska@uochb.cas.cz>
-
由 Linus Torvalds 提交于
I notice that the processcompl_compat() function seems to be leaking the 'struct async *as' in the error paths. I think that the calling convention is fundamentally buggered. The caller is the one that did the "reap_as()" to get the as thing, the caller should be the one to free it too. Freeing it in the caller also means that it very clearly always gets freed, and avoids the need for any "free in the error case too". From: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marcus Meissner <meissner@suse.de> Cc: stable <stable@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Greg KH 提交于
We need to only copy the data received by the device to userspace, not the whole kernel buffer, which can contain "stale" data. Thanks to Marcus Meissner for pointing this out and testing the fix. Reported-by: NMarcus Meissner <meissner@suse.de> Tested-by: NMarcus Meissner <meissner@suse.de> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable <stable@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
由 Ajit Khaparde 提交于
Before sending a command to the ASIC, set version properly. This is necessary for the ARM firmware to send correct data to the driver. This also fixes a bug in certain skews of the ASIC where the statistics are misreported. Signed-off-by: NAjit Khaparde <ajitk@serverengines.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
xfrm_state_clone calls kfree instead of xfrm_state_put to free a failed state. Depending on the state of the failed state, it can cause leaks to things like module references. All states should be freed by xfrm_state_put past the point of xfrm_init_state. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
When ipcomp_tunnel_attach fails we will call ipcomp_destroy twice. This may lead to double-frees on certain structures. As there is no reason to explicitly call ipcomp_destroy, this patch removes it from ipcomp*.c and lets the standard xfrm_state destruction take place. This is based on the discovery and patch by Alexey Dobriyan. Tested-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 stephen hemminger 提交于
Looks like an oversight in GRO design. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: dm: sysfs revert add empty release function to avoid debug warning dm mpath: fix stall when requeueing io dm raid1: fix null pointer dereference in suspend dm raid1: fail writes if errors are not handled and log fails dm log: userspace fix overhead_size calcuations dm snapshot: persistent annotate work_queue as on stack dm stripe: avoid divide by zero with invalid stripe count
-
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6由 Linus Torvalds 提交于
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] preserve personality flag bits across exec
-
由 Alasdair G Kergon 提交于
Revert commit d2bb7df8 at Greg's request. Author: Milan Broz <mbroz@redhat.com> Date: Thu Dec 10 23:51:53 2009 +0000 dm: sysfs add empty release function to avoid debug warning This patch just removes an unnecessary warning: kobject: 'dm': does not have a release() function, it is broken and must be fixed. The kobject is embedded in mapped device struct, so code does not need to release memory explicitly here. Cc: Greg KH <gregkh@suse.de> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Kiyoshi Ueda 提交于
This patch fixes the problem that system may stall if target's ->map_rq returns DM_MAPIO_REQUEUE in map_request(). E.g. stall happens on 1 CPU box when a dm-mpath device with queue_if_no_path bounces between all-paths-down and paths-up on I/O load. When target's ->map_rq returns DM_MAPIO_REQUEUE, map_request() requeues the request and returns to dm_request_fn(). Then, dm_request_fn() doesn't exit the I/O dispatching loop and continues processing the requeued request again. This map and requeue loop can be done with interrupt disabled, so 1 CPU system can be stalled if this situation happens. For example, commands below can stall my 1 CPU box within 1 minute or so: # dmsetup table mp mp: 0 2097152 multipath 1 queue_if_no_path 0 1 1 service-time 0 1 2 8:144 1 1 # while true; do dd if=/dev/mapper/mp of=/dev/null bs=1M count=100; done & # while true; do \ > dmsetup message mp 0 "fail_path 8:144" \ > dmsetup suspend --noflush mp \ > dmsetup resume mp \ > dmsetup message mp 0 "reinstate_path 8:144" \ > done To fix the problem above, this patch changes dm_request_fn() to exit the I/O dispatching loop once if a request is requeued in map_request(). Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@kernel.org Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-
由 Takahiro Yasui 提交于
When suspending a failed mirror, bios are completed by mirror_end_io() and __rh_lookup() in dm_rh_dec() returns NULL where a non-NULL return value is required by design. Fix this by not changing the state of the recovery failed region from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end(). Issue On 2.6.33-rc1 kernel, I hit the bug when I suspended the failed mirror by dmsetup command. BUG: unable to handle kernel NULL pointer dereference at 00000020 IP: [<f94f38e2>] dm_rh_dec+0x35/0xa1 [dm_region_hash] ... EIP: 0060:[<f94f38e2>] EFLAGS: 00010046 CPU: 0 EIP is at dm_rh_dec+0x35/0xa1 [dm_region_hash] EAX: 00000286 EBX: 00000000 ECX: 00000286 EDX: 00000000 ESI: eff79eac EDI: eff79e80 EBP: f6915cd4 ESP: f6915cc4 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process dmsetup (pid: 2849, ti=f6914000 task=eff03e80 task.ti=f6914000) ... Call Trace: [<f9530af6>] ? mirror_end_io+0x53/0x1b1 [dm_mirror] [<f9413104>] ? clone_endio+0x4d/0xa2 [dm_mod] [<f9530aa3>] ? mirror_end_io+0x0/0x1b1 [dm_mirror] [<f94130b7>] ? clone_endio+0x0/0xa2 [dm_mod] [<c02d6bcb>] ? bio_endio+0x28/0x2b [<f952f303>] ? hold_bio+0x2d/0x62 [dm_mirror] [<f952f942>] ? mirror_presuspend+0xeb/0xf7 [dm_mirror] [<c02aa3e2>] ? vmap_page_range+0xb/0xd [<f9414c8d>] ? suspend_targets+0x2d/0x3b [dm_mod] [<f9414ca9>] ? dm_table_presuspend_targets+0xe/0x10 [dm_mod] [<f941456f>] ? dm_suspend+0x4d/0x150 [dm_mod] [<f941767d>] ? dev_suspend+0x55/0x18a [dm_mod] [<c0343762>] ? _copy_from_user+0x42/0x56 [<f9417fb0>] ? dm_ctl_ioctl+0x22c/0x281 [dm_mod] [<f9417628>] ? dev_suspend+0x0/0x18a [dm_mod] [<f9417d84>] ? dm_ctl_ioctl+0x0/0x281 [dm_mod] [<c02c3c4b>] ? vfs_ioctl+0x22/0x85 [<c02c422c>] ? do_vfs_ioctl+0x4cb/0x516 [<c02c42b7>] ? sys_ioctl+0x40/0x5a [<c0202858>] ? sysenter_do_call+0x12/0x28 Analysis When recovery process of a region failed, dm_rh_recovery_end() function changes the state of the region from RM_RH_RECOVERING to DM_RH_NOSYNC. When recovery_complete() is executed between dm_rh_update_states() and dm_writes() in do_mirror(), bios are processed with the region state, DM_RH_NOSYNC. However, the region data is freed without checking its pending count when dm_rh_update_states() is called next time. When bios are finished by mirror_end_io(), __rh_lookup() in dm_rh_dec() returns NULL even though a valid return value are expected. Solution Remove the state change of the recovery failed region from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end(). We can remove the state change because: - If the region data has been released by dm_rh_update_states(), a new region data is created with the state of DM_RH_NOSYNC, and bios are processed according to the DM_RH_NOSYNC state. - If the region data has not been released by dm_rh_update_states(), a state of the region is DM_RH_RECOVERING and bios are put in the delayed_bio list. The flag change from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end() was added in the following commit: dm raid1: handle resync failures author Jonathan Brassow <jbrassow@redhat.com> Thu, 12 Jul 2007 16:29:04 +0000 (17:29 +0100) http://git.kernel.org/linus/f44db678edcc6f4c2779ac43f63f0b9dfa28b724Signed-off-by: NTakahiro Yasui <tyasui@redhat.com> Reviewed-by: NMikulas Patocka <mpatocka@redhat.com> Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
-