- 14 12月, 2011 6 次提交
-
-
由 Eddie Wai 提交于
During session recovery, the conn_stop call will trigger a flush to all outstanding SCSI cmds in the xmit queue. This will set all outstanding task->sc to NULL prior to the session_teardown call which frees the task memory. In the bnx2i SCSI response processing path, only the task was being checked for NULL under the session lock before the task->sc->request dereferencing. If there are outstanding SCSI cmd responses pending for process, the following kernel panic can be exposed where task->sc was found to be NULL. Call Trace: [ 69.720205] [<ffffffffa040d0d0>] bnx2i_process_new_cqes+0x290/0x3c0 [bnx2i] [ 69.804289] [<ffffffffa040d233>] bnx2i_fastpath_notification+0x33/0xa0 [bnx2 i] [ 69.891490] [<ffffffffa040d37b>] bnx2i_indicate_kcqe+0xdb/0x330 [bnx2i] [ 69.971427] [<ffffffffa03eac5e>] service_kcqes+0x16e/0x1d0 [cnic] [ 70.045132] [<ffffffffa03eacea>] cnic_service_bnx2x_kcq+0x2a/0x50 [cnic] [ 70.126105] [<ffffffffa03ead53>] cnic_service_bnx2x_bh+0x43/0x140 [cnic] [ 70.207081] [<ffffffff81060676>] tasklet_action+0x66/0x110 [ 70.273521] [<ffffffff8106025f>] __do_softirq+0xef/0x220 [ 70.337887] [<ffffffff81447ebc>] call_softirq+0x1c/0x30 This patch adds the !task->sc check and also protects the sc dereferencing under the session lock. Signed-off-by: NEddie Wai <eddie.wai@broadcom.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Mike Christie 提交于
iscsi_conn_setup can fail so we must check for NULL being returned. Signed-off-by: NMike Christie <michaelc@cs.wisc.edu> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Tomas Henzl 提交于
When the qla4xxx_get_fwddb_entry returns QLA_ERROR the nex_idx is not updated, for (idx = 0; idx < max_ddbs; idx = next_idx) { ret = qla4xxx_get_fwddb_entry(ha, idx, NULL, 0, NULL, &next_idx, &state, &conn_err, NULL, NULL); if (ret == QLA_ERROR) continue; This means there is a risk that the 'idx < max_ddbs' condition will never met and the loop will loop forever. Fix this by explicitly increasing the next_idx in the error condition. Maybe a break instead of continue is more appropriate, leaving the decision on the qlogic maintainer. Signed-off-by: NTomas Henzl <thenzl@redhat.com> Signed-off-by: NMike Christie <michaelc@cs.wisc.edu> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Mike Christie 提交于
With open-iscsi support, target entries persisted in the FLASH were not login. Added support in the qla4xxx driver to do the login on probe time to the target entries saved in the FLASH by user. With this changes upgrade to the new kernel with open-iscsi support in qla4xxx will ensure users original target entries login on driver load Signed-off-by: NManish Rangankar <manish.rangankar@qlogic.com> Signed-off-by: NRavi Anand <ravi.anand@qlogic.com> Signed-off-by: NMike Christie <michaelc@cs.wisc.edu> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Steffen Maier 提交于
zfcp_scsi_slave_destroy erroneously always tried to finish its task even if the corresponding previous zfcp_scsi_slave_alloc returned early. This can lead to kernel page faults on accessing uninitialized fields of struct zfcp_scsi_dev in zfcp_erp_lun_shutdown_wait. Take the port field of the struct to determine if slave_alloc returned early. This zfcp bug is exposed by 4e6c82b3 (in turn fixing f7c9c6bb to be compatible with 21208ae5) which can call slave_destroy for a corresponding previous slave_alloc that did not finish. This patch is based on James Bottomley's fix suggestion in http://www.spinics.net/lists/linux-scsi/msg55449.html. Signed-off-by: NSteffen Maier <maier@linux.vnet.ibm.com> Cc: <stable@kernel.org> #2.6.38+ Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Thomas Gleixner 提交于
The error exit path leaks preempt count. Add the missing put_cpu(). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NYi Zou <yi.zou@intel.com> Cc: stable@kernel.org Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
- 12 12月, 2011 15 次提交
-
-
由 Chad Dupuis 提交于
Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Giridhar Malavali 提交于
Signed-off-by: NGiridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Saurav Kashyap 提交于
[jejb: checkpatch fixes] Add more fine grain parsing of vha->loop_state to export a more accurate fc_host port_state. Signed-off-by: NSaurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Giridhar Malavali 提交于
Signed-off-by: NGiridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Giridhar Malavali 提交于
Signed-off-by: NGiridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Chad Dupuis 提交于
Signed-off-by: NGiridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Chad Dupuis 提交于
[jejb: fixup checkpatch error] Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Andrew Vasquez 提交于
We need to return QLA_FUNCTION_TIMEOUT immediately otherwise we mess up the mailbox command state machine. Signed-off-by: NAndrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Giridhar Malavali 提交于
Signed-off-by: NGiridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Giridhar Malavali 提交于
[SCSI] qla2xxx: Stop unconditional completion of mailbox commands issued in interrupt mode during firmware hang. Signed-off-by: NGiridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Giridhar Malavali 提交于
If there is an error creating multiple response queues then we need to revert the request queue mapping back to request queue 0. Signed-off-by: NGiridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: NAndrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Saurav Kashyap 提交于
Signed-off-by: NSaurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Arun Easi 提交于
Signed-off-by: NArun Easi <arun.easi@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Saurav Kashyap 提交于
This function can wait for 5min under certain scenarios. One of them is when the port is down from switch and bus reset is issued. The bus reset used to wait for 5 minutes for the loop and upper layer callers used to hang and give stack trace because of getting stuck for 120 sec. It is legacy code that was used when the driver used to do queuing of the commands. Signed-off-by: NSaurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: NChad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
由 Anton Blanchard 提交于
_scsih_smart_predicted_fault is called in an interrupt and therefore must allocate memory using GFP_ATOMIC. Signed-off-by: NAnton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: NJames Bottomley <JBottomley@Parallels.com>
-
- 10 12月, 2011 9 次提交
-
-
由 Linus Torvalds 提交于
-
git://git.samba.org/sfrench/cifs-2.6由 Linus Torvalds 提交于
* git://git.samba.org/sfrench/cifs-2.6: cifs: check for NULL last_entry before calling cifs_save_resume_key cifs: attempt to freeze while looping on a receive attempt cifs: Fix sparse warning when calling cifs_strtoUCS CIFS: Add descriptions to the brlock cache functions
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: Calling __pa() with an ioremap()ed address is invalid x86, hpet: Immediately disable HPET timer 1 if rtc irq is masked x86/intel_mid: Kconfig select fix x86/intel_mid: Fix the Kconfig for MID selection
-
git://git.pengutronix.de/git/wsa/linux-2.6由 Linus Torvalds 提交于
* 'spi/for-3.2' of git://git.pengutronix.de/git/wsa/linux-2.6: spi/gpio: fix section mismatch warning spi/fsl-espi: disable CONFIG_SPI_FSL_ESPI=m build spi/nuc900: Include linux/module.h spi/ath79: fix compile error due to missing include
-
git://neil.brown.name/md由 Linus Torvalds 提交于
* 'for-linus' of git://neil.brown.name/md: md: raid5 crash during degradation md/raid5: never wait for bad-block acks on failed device. md: ensure new badblocks are handled promptly. md: bad blocks shouldn't cause a Blocked status on a Faulty device. md: take a reference to mddev during sysfs access. md: refine interpretation of "hold_active == UNTIL_IOCTL". md/lock: ensure updates to page_attrs are properly locked.
-
git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: arch/tile: use new generic {enable,disable}_percpu_irq() routines drivers/net/ethernet/tile: use skb_frag_page() API asm-generic/unistd.h: support new process_vm_{readv,write} syscalls arch/tile: fix double-free bug in homecache_free_pages() arch/tile: add a few #includes and an EXPORT to catch up with kernel changes.
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu由 Linus Torvalds 提交于
* 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: MAINTAINERS: Update amd-iommu F: patterns iommu/amd: Fix typo in kernel-parameters.txt iommu/msm: Fix compile error in mach-msm/devices-iommu.c Fix comparison using wrong pointer variable in dma debug code
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek - Fix lost speaker volume controls ALSA: hda/realtek - Create "Bass Speaker" for two speaker pins ALSA: hda/realtek - Don't create extra controls with channel suffix ALSA: hda - Fix remaining VREF mute-LED NID check in post-3.1 changes ALSA: hda - Fix GPIO LED setup for IDT 92HD75 codecs ASoC: Provide a more complete DMA driver stub ASoC: Remove references to corgi and spitz from machine driver document ASoC: Make SND_SOC_MX27VIS_AIC32X4 depend on I2C ASoC: Fix dependency for SND_SOC_RAUMFELD and SND_PXA2XX_SOC_HX4700 ASoC: uda1380: Return proper error in uda1380_modinit failure path ASoC: kirkwood: Make SND_KIRKWOOD_SOC_OPENRD and SND_KIRKWOOD_SOC_T5325 depend on I2C ASoC: Mark WM8994 ADC muxes as virtual ALSA: hda/realtek - Fix Oops in alc_mux_select() ALSA: sis7019 - give slow codecs more time to reset
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip由 Linus Torvalds 提交于
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Do no try to schedule task events if there are none lockdep, kmemcheck: Annotate ->lock in lockdep_init_map() perf header: Use event_name() to get an event name perf stat: Failure with "Operation not supported"
-
- 09 12月, 2011 10 次提交
-
-
由 Mandeep Singh Baines 提交于
In order to safely dereference current->real_parent inside an rcu_read_lock, we need an rcu_dereference. Signed-off-by: NMandeep Singh Baines <msb@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexandre Bounine 提交于
Modify initialization of PCIe capability registers in Tsi721 mport driver: - change Completion Timeout value to avoid unexpected data transfer aborts during intensive traffic. - replace hardcoded offset of PCIe capability block by making it use the common function. This patch is applicable to kernel versions starting from 3.2-rc1. Signed-off-by: NAlexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexandre Bounine 提交于
Bug fix for Tsi721 RapidIO mport driver: Tsi721 supports four RapidIO mailboxes (MBOX0 - MBOX3) as defined by RapidIO specification. Mailbox resources has to be properly reported to allow use of all available mailboxes (initial version reports only MBOX0). This patch is applicable to kernel versions staring from 3.2-rc1. Signed-off-by: NAlexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexandre Bounine 提交于
Replace the pair dma_alloc_coherent()+memset() with the new dma_zalloc_coherent() added by Andrew Morton for kernel version 3.2 Signed-off-by: NAlexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
Since commit a25cac51 ("proc: Consider NO_HZ when printing idle and iowait times") we are reporting idle/io_wait time also while a CPU is tickless. We rely on get_{idle,iowait}_time functions to retrieve proper data. These functions, however, use usecs_to_cputime to translate micro seconds time to cputime64_t. This is just an alias to usecs_to_jiffies which reduces the data type from u64 to unsigned int and also checks whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET) and returns MAX_JIFFY_OFFSET in that case. When we overflow depends on CONFIG_HZ but especially for CONFIG_HZ_300 it is quite low (1431649781) so we are getting MAX_JIFFY_OFFSET for >3000s! until we overflow unsigned int. Just for reference CONFIG_HZ_100 has an overflow window around 20s, CONFIG_HZ_250 ~8s and CONFIG_HZ_1000 ~2s. This results in a bug when people saw [h]top going mad reporting 100% CPU usage even though there was basically no CPU load. The reason was simply that /proc/stat stopped reporting idle/io_wait changes (and reported MAX_JIFFY_OFFSET) and so the only change happening was for user system time. Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision to 32b type and it is much more appropriate for cumulative time values (unlike usecs_to_jiffies which intended for timeout calculations). Signed-off-by: NMichal Hocko <mhocko@suse.cz> Tested-by: NArtem S. Tashkinov <t.artem@mailcity.com> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Commit f5252e00 ("mm: avoid null pointer access in vm_struct via /proc/vmallocinfo") adds newly allocated vm_structs to the vmlist after it is fully initialised. Unfortunately, it did not check that __vmalloc_area_node() successfully populated the area. In the event of allocation failure, the vmalloc area is freed but the pointer to freed memory is inserted into the vmlist leading to a a crash later in get_vmalloc_info(). This patch adds a check for ____vmalloc_area_node() failure within __vmalloc_node_range. It does not use "goto fail" as in the previous error path as a warning was already displayed by __vmalloc_area_node() before it called vfree in its failure path. Credit goes to Luciano Chavez for doing all the real work of identifying exactly where the problem was. Signed-off-by: NMel Gorman <mgorman@suse.de> Reported-by: NLuciano Chavez <lnx1138@linux.vnet.ibm.com> Tested-by: NLuciano Chavez <lnx1138@linux.vnet.ibm.com> Reviewed-by: NRik van Riel <riel@redhat.com> Acked-by: NDavid Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [3.1.x+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
setup_zone_migrate_reserve() expects that zone->start_pfn starts at pageblock_nr_pages aligned pfn otherwise we could access beyond an existing memblock resulting in the following panic if CONFIG_HOLES_IN_ZONE is not configured and we do not check pfn_valid: IP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180 *pdpt = 0000000000000000 *pde = f000ff53f000ff53 Oops: 0000 [#1] SMP Pid: 1, comm: swapper Not tainted 3.0.7-0.7-pae #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform EIP: 0060:[<c02d331d>] EFLAGS: 00010006 CPU: 0 EIP is at setup_zone_migrate_reserve+0xcd/0x180 EAX: 000c0000 EBX: f5801fc0 ECX: 000c0000 EDX: 00000000 ESI: 000c01fe EDI: 000c01fe EBP: 00140000 ESP: f2475f58 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process swapper (pid: 1, ti=f2474000 task=f2472cd0 task.ti=f2474000) Call Trace: [<c02d389c>] __setup_per_zone_wmarks+0xec/0x160 [<c02d3a1f>] setup_per_zone_wmarks+0xf/0x20 [<c08a771c>] init_per_zone_wmark_min+0x27/0x86 [<c020111b>] do_one_initcall+0x2b/0x160 [<c086639d>] kernel_init+0xbe/0x157 [<c05cae26>] kernel_thread_helper+0x6/0xd Code: a5 39 f5 89 f7 0f 46 fd 39 cf 76 40 8b 03 f6 c4 08 74 32 eb 91 90 89 c8 c1 e8 0e 0f be 80 80 2f 86 c0 8b 14 85 60 2f 86 c0 89 c8 <2b> 82 b4 12 00 00 c1 e0 05 03 82 ac 12 00 00 8b 00 f6 c4 08 0f EIP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180 SS:ESP 0068:f2475f58 CR2: 00000000000012b4 We crashed in pageblock_is_reserved() when accessing pfn 0xc0000 because highstart_pfn = 0x36ffe. The issue was introduced in 3.0-rc1 by 6d3163ce ("mm: check if any page in a pageblock is reserved before marking it MIGRATE_RESERVE"). Make sure that start_pfn is always aligned to pageblock_nr_pages to ensure that pfn_valid s always called at the start of each pageblock. Architectures with holes in pageblocks will be correctly handled by pfn_valid_within in pageblock_is_reserved. Signed-off-by: NMichal Hocko <mhocko@suse.cz> Signed-off-by: NMel Gorman <mgorman@suse.de> Tested-by: NDang Bo <bdang@vmware.com> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Arve Hjnnevg <arve@android.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> [3.0+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hillf Danton 提交于
Avoid unlocking and unlocked page if we failed to lock it. Signed-off-by: NHillf Danton <dhillf@gmail.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Youquan Song 提交于
Commit 70b50f94 ("mm: thp: tail page refcounting fix") keeps all page_tail->_count zero at all times. But the current kernel does not set page_tail->_count to zero if a 1GB page is utilized. So when an IOMMU 1GB page is used by KVM, it wil result in a kernel oops because a tail page's _count does not equal zero. kernel BUG at include/linux/mm.h:386! invalid opcode: 0000 [#1] SMP Call Trace: gup_pud_range+0xb8/0x19d get_user_pages_fast+0xcb/0x192 ? trace_hardirqs_off+0xd/0xf hva_to_pfn+0x119/0x2f2 gfn_to_pfn_memslot+0x2c/0x2e kvm_iommu_map_pages+0xfd/0x1c1 kvm_iommu_map_memslots+0x7c/0xbd kvm_iommu_map_guest+0xaa/0xbf kvm_vm_ioctl_assigned_device+0x2ef/0xa47 kvm_vm_ioctl+0x36c/0x3a2 do_vfs_ioctl+0x49e/0x4e4 sys_ioctl+0x5a/0x7c system_call_fastpath+0x16/0x1b RIP gup_huge_pud+0xf2/0x159 Signed-off-by: NYouquan Song <youquan.song@intel.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Youquan Song 提交于
With the 3.2-rc kernel, IOMMU 2M pages in KVM works. But when I tried to use IOMMU 1GB pages in KVM, I encountered an oops and the 1GB page failed to be used. The root cause is that 1GB page allocation calls gup_huge_pud() while 2M page calls gup_huge_pmd. If compound pages are used and the page is a tail page, gup_huge_pmd() increases _mapcount to record tail page are mapped while gup_huge_pud does not do that. So when the mapped page is relesed, it will result in kernel oops because the page is not marked mapped. This patch add tail process for compound page in 1GB huge page which keeps the same process as 2M page. Reproduce like: 1. Add grub boot option: hugepagesz=1G hugepages=8 2. mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages 3. qemu-kvm -m 2048 -hda os-kvm.img -cpu kvm64 -smp 4 -mem-path /dev/hugepages -net none -device pci-assign,host=07:00.1 kernel BUG at mm/swap.c:114! invalid opcode: 0000 [#1] SMP Call Trace: put_page+0x15/0x37 kvm_release_pfn_clean+0x31/0x36 kvm_iommu_put_pages+0x94/0xb1 kvm_iommu_unmap_memslots+0x80/0xb6 kvm_assign_device+0xba/0x117 kvm_vm_ioctl_assigned_device+0x301/0xa47 kvm_vm_ioctl+0x36c/0x3a2 do_vfs_ioctl+0x49e/0x4e4 sys_ioctl+0x5a/0x7c system_call_fastpath+0x16/0x1b RIP put_compound_page+0xd4/0x168 Signed-off-by: NYouquan Song <youquan.song@intel.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-