- 05 4月, 2008 18 次提交
-
-
由 Balbir Singh 提交于
A boot option for the memory controller was discussed on lkml. It is a good idea to add it, since it saves memory for people who want to turn off the memory controller. By default the option is on for the following two reasons: 1. It provides compatibility with the current scheme where the memory controller turns on if the config option is enabled 2. It allows for wider testing of the memory controller, once the config option is enabled We still allow the create, destroy callbacks to succeed, since they are not aware of boot options. We do not populate the directory will memory resource controller specific files. Signed-off-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Menage 提交于
The effects of cgroup_disable=foo are: - foo isn't auto-mounted if you mount all cgroups in a single hierarchy - foo isn't visible as an individually mountable subsystem As a result there will only ever be one call to foo->create(), at init time; all processes will stay in this group, and the group will never be mounted on a visible hierarchy. Any additional effects (e.g. not allocating metadata) are up to the foo subsystem. This doesn't handle early_init subsystems (their "disabled" bit isn't set be, but it could easily be extended to do so if any of the early_init systems wanted it - I think it would just involve some nastier parameter processing since it would occur before the command-line argument parser had been run. Hugh said: Ballpark figures, I'm trying to get this question out rather than processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead to the affected paths, booting with cgroup_disable=memory cuts that back to 1% overhead (due to slightly bigger struct page). I'm no expert on distros, they may have no interest whatever in CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or without it, or apply the cgroup_disable=memory patches. Unix bench's execl test result on x86_64 was == just after boot without mounting any cgroup fs.== mem_cgorup=off : Execl Throughput 43.0 3150.1 732.6 mem_cgroup=on : Execl Throughput 43.0 2932.6 682.0 == [lizf@cn.fujitsu.com: fix boot option parsing] Signed-off-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: revert assign IRQs to hpet timer x86: tsc prevent time going backwards xen: Clear PG_pinned in release_{pt,pd}() xen: Do not pin/unpin PMD pages xen: refactor xen_{alloc,release}_{pt,pd}() x86, agpgart: scary messages are fortunately obsolete xen: fix grant table bug x86: fix breakage of vSMP irq operations x86: print message if nmi_watchdog=2 cannot be enabled x86: fix nmi_watchdog=2 on Pentium-D CPUs
-
由 Geert Uytterhoeven 提交于
Long overdue update of the m68k defconfigs Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
The default defconfig should be one from arch/m68k/configs/ arch/m68k/defconfig was not exactly identical to amiga_defconfig but also considering how long they have been without any update that doesn't seem to have been on purpose. Signed-off-by: NAdrian Bunk <adrian.bunk@movial.fi> Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev由 Linus Torvalds 提交于
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: pata_ali: disable ATAPI DMA libata: ATA_12/16 doesn't fall into ATAPI_MISC libata: uninline atapi_cmd_type() libata: fix IDENTIFY order in ata_bus_probe()
-
由 Linus Torvalds 提交于
Mikulas Patocka noted that the optimization where we check if a buffer was already dirty (and we avoid re-dirtying it) was not really SMP-safe. Since the read of the old status was not synchronized with anything, an aggressive CPU re-ordering of memory accesses might have moved that read up to before the data was even written to the buffer, and another CPU that cleaned it again, causing the newly dirty state to never actually hit the disk. Admittedly this would probably never trigger in practice, but it's still wrong. Mikulas sent a patch that fixed the problem, but I dislike the subtlety of the whole optimization, so this is an alternate fix that is more explicit about the particular SMP ordering for the optimization, and separates out the speculative reads of the buffer state into its own conditional (and makes the memory barrier only happen if we are likely to actually hit the optimized case in the first place). I considered removing the optimization entirely, but Andrew argued for it's continued existence. I'm a push-over. Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Commit f63fd7e2 ("parport_pc: detection for SuperIO IT87XX POST") only released the IO port region on success, not when the probe for the IT87XX chip failed. That caused not only a reserved region to leak, but also caused an oops when the driver module was unloaded and somebody tried to cat /proc/ioports - because the string that was assigned to the IO port region was a static string in the module virtual address area. Reported-by: NLubos Lunak <l.lunak@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Petr Cvek <petr.cvek@tul.cz> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Thomas Gleixner 提交于
The commits: commit 37a47db8 Author: Balaji Rao <balajirrao@gmail.com> Date: Wed Jan 30 13:30:03 2008 +0100 x86: assign IRQs to HPET timers, fix and commit e3f37a54 Author: Balaji Rao <balajirrao@gmail.com> Date: Wed Jan 30 13:30:03 2008 +0100 x86: assign IRQs to HPET timers have been identified to cause a regression on some platforms due to the assignement of legacy IRQs which makes the legacy devices connected to those IRQs disfunctional. Revert them. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=10382Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Thomas Gleixner 提交于
We already catch most of the TSC problems by sanity checks, but there is a subtle bug which has been in the code for ever. This can cause time jumps in the range of hours. This was reported in: http://lkml.org/lkml/2007/8/23/96 and http://lkml.org/lkml/2008/3/31/23 I was able to reproduce the problem with a gettimeofday loop test on a dual core and a quad core machine which both have sychronized TSCs. The TSCs seems not to be perfectly in sync though, but the kernel is not able to detect the slight delta in the sync check. Still there exists an extremly small window where this delta can be observed with a real big time jump. So far I was only able to reproduce this with the vsyscall gettimeofday implementation, but in theory this might be observable with the syscall based version as well. CPU 0 updates the clock source variables under xtime/vyscall lock and CPU1, where the TSC is slighty behind CPU0, is reading the time right after the seqlock was unlocked. The clocksource reference data was updated with the TSC from CPU0 and the value which is read from TSC on CPU1 is less than the reference data. This results in a huge delta value due to the unsigned subtraction of the TSC value and the reference value. This algorithm can not be changed due to the support of wrapping clock sources like pm timer. The huge delta is converted to nanoseconds and added to xtime, which is then observable by the caller. The next gettimeofday call on CPU1 will show the correct time again as now the TSC has advanced above the reference value. To prevent this TSC specific wreckage we need to compare the TSC value against the reference value and return the latter when it is larger than the actual TSC value. I pondered to mark the TSC unstable when the readout is smaller than the reference value, but this would render an otherwise good and fast clocksource unusable without a real good reason. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mark McLoughlin 提交于
Signed-off-by: NMark McLoughlin <markmc@redhat.com> Cc: xen-devel@lists.xensource.com Cc: Mark McLoughlin <markmc@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mark McLoughlin 提交于
i.e. with this simple test case: int fd = open("/dev/zero", O_RDONLY); munmap(mmap((void *)0x40000000, 0x1000_LEN, PROT_READ, MAP_PRIVATE, fd, 0), 0x1000); close(fd); we currently get: kernel BUG at arch/x86/xen/enlighten.c:678! ... EIP is at xen_release_pt+0x79/0xa9 ... Call Trace: [<c041da25>] ? __pmd_free_tlb+0x1a/0x75 [<c047a192>] ? free_pgd_range+0x1d2/0x2b5 [<c047a2f3>] ? free_pgtables+0x7e/0x93 [<c047b272>] ? unmap_region+0xb9/0xf5 [<c047c1bd>] ? do_munmap+0x193/0x1f5 [<c047c24f>] ? sys_munmap+0x30/0x3f [<c0408cce>] ? syscall_call+0x7/0xb ======================= and xen complains: (XEN) mm.c:2241:d4 Mfn 1cc37 not pinned Further details at: https://bugzilla.redhat.com/436453Signed-off-by: NMark McLoughlin <markmc@redhat.com> Cc: xen-devel@lists.xensource.com Cc: Mark McLoughlin <markmc@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mark McLoughlin 提交于
Signed-off-by: NMark McLoughlin <markmc@redhat.com> Cc: xen-devel@lists.xensource.com Cc: Mark McLoughlin <markmc@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Pavel Machek 提交于
Fix obsolete printks in aperture-64. We used not to handle missing agpgart, but we handle it okay now. Signed-off-by: NPavel Machek <pavel@suse.cz> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Michael Abd-El-Malek 提交于
fix memory corruption and crash due to mis-sized grant table. A PV OS has two grant table data structures: the grant table itself and a free list. The free list is composed of an array of pages, which grow dynamically as the guest OS requires more grants. While the grant table contains 8-byte entries, the free list contains 4-byte entries. So we have half as many pages in the free list than in the grant table. There was a bug in the free list allocation code. The free list was indexed as if it was the same size as the grant table. But it's only half as large. So memory got corrupted, and I was seeing crashes in the slab allocator later on. Taken from: http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/4018c0da3360Signed-off-by: NMichael Abd-El-Malek <mabdelmalek@cmu.edu> Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ravikiran G Thirumalai 提交于
25-rc* stopped working with CONFIG_X86_VSMP on vSMP machines. Looks like the vsmp irq ops got accidentally removed during merge of x86_64 pvops in 2.6.25. -- commit 6abcd98f removed vsmp irq ops. Tested with both CONFIG_X86_VSMP and without CONFIG_X86_VSMP, on vSMP and non vSMP x86_64 machines. Please apply. Signed-off-by: NRavikiran Thirumalai <kiran@scalex86.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
right now if there's no CPU support for nmi_watchdog=2 we'll just refuse it silently. print a useful warning. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
implement nmi_watchdog=2 on this class of CPUs: cpu family : 15 model : 6 model name : Intel(R) Pentium(R) D CPU 3.00GHz the watchdog's ->setup() method is safe anyway, so if the CPU cannot support it we'll bail out safely. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 04 4月, 2008 12 次提交
-
-
由 Tejun Heo 提交于
ATAPI DMA just doesn't work reliably on pata_ali. The IDE driver can do it but for some mysterious reason, pata_ali can't. This patch disables it by default and makes the driver whine during initialization. "pata_ali.atapi_dma" parameter is added so that user can bypass the workaround. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jeff@garzik.org>
-
由 Tejun Heo 提交于
SAT passthrus don't really fit into ATAPI_MISC class. SAT passthru commands always transfer multiple of 512 bytes and variable length response is not allowed. This patch creates a separate category - ATAPI_PASS_THRU - for these. This fixes HSM violation on "hdparm -I". Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jeff@garzik.org>
-
由 Tejun Heo 提交于
Uninline atapi_cmd_type(). It doesn't really have to be inline and more case will be added which need to access unexported libata variable. Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jeff@garzik.org>
-
Commit f58229f8 accidentally made ata_bus_probe() not use reverse order probing. Fix it. There currently isn't any PATA driver which uses obsolete ata_bus_probe() path, so this patch is mainly for correctness. Signed-off-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: NTejun Heo <htejun@gmail.com> Signed-off-by: NJeff Garzik <jeff@garzik.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: selinux: prevent rentry into the FS
-
由 Roland McGrath 提交于
This avoids using wrmsr on MSR_IA32_DEBUGCTLMSR when it's not needed. No wrmsr ever needs to be done if noone has ever used block stepping. Without this change, using ptrace on 2.6.25 on an x86 KVM guest will tickle KVM's missing support for the MSR and crash the guest kernel. Though host KVM is the buggy one, this makes for a regression in the guest behavior from 2.6.24->2.6.25 that we can easily avoid. I also corrected some bad whitespace. Signed-off-by: NRoland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input由 Linus Torvalds 提交于
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: appletouch - add product IDs for the 4th generation MacBooks
-
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc由 Linus Torvalds 提交于
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Fix MPC5200 (not B!) device tree so FEC ethernet works [POWERPC] mpc5200: Amalgamated DTS fixes and updates [POWERPC] Fix rtas_flash procfs interface [POWERPC] Fix deadlock with mmu_hash_lock in hash_page_sync [POWERPC] Fix iSeries hard irq enabling regression [POWERPC] Fix CPM2 SCC1 clock initialization. [POWERPC] Fix defconfigs so we dont set both GENRTC and RTCLIB [POWERPC] fsldma: Use compatiable binding as spec [POWERPC] sata_fsl: reduce compatibility to fsl,pq-sata [POWERPC] 83xx: enable usb in 837x rdb and 83xx defconfigs [POWERPC] 83xx: Fix wrong USB phy type in mpc837xrdb dts
-
由 Sven Schnelle 提交于
Signed-off-by: NSven Schnelle <svens@stackframe.org> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sven Schnelle 提交于
Signed-off-by: NSven Schnelle <svens@stackframe.org> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Hugh Dickins 提交于
The loop block driver is careful to mask __GFP_IO|__GFP_FS out of its mapping_gfp_mask, to avoid hangs under memory pressure. But nowadays it uses splice, usually going through __generic_file_splice_read. That must use mapping_gfp_mask instead of GFP_KERNEL to avoid those hangs. Signed-off-by: NHugh Dickins <hugh@veritas.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Josef Bacik 提交于
BUG fix. Keep us from re-entering the fs when we aren't supposed to. See discussion at http://marc.info/?t=120716967100004&r=1&w=2Signed-off-by: NJosef Bacik <jbacik@redhat.com> Acked-by: NStephen Smalley <sds@tycho.nsa.gov> Signed-off-by: NJames Morris <jmorris@namei.org>
-
- 03 4月, 2008 10 次提交
-
-
由 René Bürgel 提交于
This gets the FEC ethernet driver working again on the lite5200 platform. The FEC driver is also compatible with the MPC5200, not only with the MPC5200B, so this adds a suitable entry to the driver's match list. Furthermore this adds the settings for the PHY in the dts file for the Lite5200. Note, that this is not exactly the same as in the Lite5200B, because the PHY is located at f0003000:01 for the 5200, and at :00 for the 5200B. This was tested on a Lite5200 and a Lite5200B, both booted a kernel via tftp and mounted the root via nfs successfully. Signed-off-by: NRené Bürgel <r.buergel@unicontrol.de> Acked-by: NGrant Likely <grant.likely@secretlab.ca> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Bartlomiej Sieka 提交于
DTS updates that fix booting problems on mpc5200-based boards: - change to ethernet reg property - addition of mdio and phy nodes - removal of pci node (Motion-Pro board) Other DTS updates: - update i2c device tree nodes - add lpb bus node and flash device (without partitions defined) - add rtc i2c nodes Signed-off-by: NMarian Balakowicz <m8@semihalf.com> Acked-by: NGrant Likely <grant.likely@secretlab.ca> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Maxim Shchetynin 提交于
Handling of the proc_dir_entry->count was changed in 2.6.24-rc5. After this change, the default value for pde->count is 1 and not 0 as before. Therefore, if we want to check whether our procfs file is already opened (already in use), we have to check if pde->count is greater than 2 rather than 1. Signed-off-by: NMaxim Shchetynin <maxim@de.ibm.com> Signed-off-by: NJens Osterkamp <jens@de.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Benjamin Herrenschmidt 提交于
hash_page_sync() takes and releases the low level mmu hash lock in order to sync with other processors disposing of page tables. Because that lock can be needed to service hash misses triggered by interrupt handlers, taking it must be done with interrupts off. However, hash_page_sync() appears to be called with interrupts enabled, thus causing occasional deadlocks. We fix it by making sure hash_page_sync() masks interrupts while holding the lock. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Benjamin Herrenschmidt 提交于
A subtle bug sneaked into iSeries recently. On this platform, we must not normally clear MSR:EE (the hardware external interrupt enable) except for short periods of time. Taking an interrupt while soft-disabled doesn't cause us to clear it for example. The iSeries kernel expects to mostly run with MSR:EE enabled at all times except in a few exception entry/exit code paths. Thus local_irq_enable() doesn't check if it needs to hard-enable as it expects this to be unnecessary on iSeries. However, hard_irq_disable() _does_ cause MSR:EE to be cleared, including on iSeries. A call to it was recently added to the context switch code, thus causing interrupts to become disabled for a long periods of time, causing the iSeries watchdog to kick in under some circumstances and other nasty things. This patch fixes it by making local_irq_enable() properly re-enable MSR:EE on iSeries. It basically removes a return statement here to make iSeries use the same code path as everybody else. That does mean that we might occasionally get spurious decrementer interrupts but I don't think that matters. Another option would have been to make hard_irq_disable() a nop on iSeries but I didn't like it much, in case we have good reasons to hard-disable. Part of the patch is fixes to make sure the hard_enabled PACA field is properly set on iSeries as it used not to be before, since it was mostly unused. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Laurent Pinchart 提交于
A missing break statement in a switch caused cpm2_clk_setup() to initialize SCC2 instead of SCC1. Signed-off-by: NLaurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6由 Linus Torvalds 提交于
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: ohci: fix 2 timers to fire at jiffies + 1s USB: Allow initialization of broken keyspan serial adapters. USB: fix bug in sg initialization in usbtest USB: serial: fix regression in Visor/Palm OS module for kernels >= 2.6.24 USB: cp2101: Add identifiers for the Telegesys ETRX2USB USB: serial: ti_usb_3410_5052: Correct TUSB3410 endpoint requirements. USB: another ehci_iaa_watchdog fix
-
由 Andrew Morton 提交于
A nasty compile error: In file included from security/keys/internal.h:16, from security/keys/sysctl.c:14: include/linux/key-ui.h: In function 'key_permission': include/linux/key-ui.h:51: error: invalid use of undefined type 'struct task_struct' apparently the compiler has decided that it needs to know sizeof(task_struct) so that it can add zero to a task_struct* (which is rather dumb of it). Getting task_struct in scope in these deeply-nested headers is scary-looking, so let's just remove the "+ 0". Cc: David Howells <dhowells@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mathieu Desnoyers 提交于
Markers do not mix well with CONFIG_PREEMPT_RCU because it uses preempt_disable/enable() and not rcu_read_lock/unlock for minimal intrusiveness. We would need call_sched and sched_barrier primitives. Currently, the modification (connection and disconnection) of probes from markers requires changes to the data structure done in RCU-style : a new data structure is created, the pointer is changed atomically, a quiescent state is reached and then the old data structure is freed. The quiescent state is reached once all the currently running preempt_disable regions are done running. We use the call_rcu mechanism to execute kfree() after such quiescent state has been reached. However, the new CONFIG_PREEMPT_RCU version of call_rcu and rcu_barrier does not guarantee that all preempt_disable code regions have finished, hence the race. The "proper" way to do this is to use rcu_read_lock/unlock, but we don't want to use it to minimize intrusiveness on the traced system. (we do not want the marker code to call into much of the OS code, because it would quickly restrict what can and cannot be instrumented, such as the scheduler). The temporary fix, until we get call_rcu_sched and rcu_barrier_sched in mainline, is to use synchronize_sched before each call_rcu calls, so we wait for the quiescent state in the system call code path. It will slow down batch marker enable/disable, but will make sure the race is gone. Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ken'ichi Ohmichi 提交于
Fix the problem that makedumpfile sometimes fails on x86_64 machine. This patch adds the symbol "phys_base" to a vmcoreinfo data. The vmcoreinfo data has the minimum debugging information only for dump filtering. makedumpfile (dump filtering command) gets it to distinguish unnecessary pages, and makedumpfile creates a small dumpfile. On x86_64 kernel which compiled with CONFIG_PHYSICAL_START=0x0 and CONFIG_RELOCATABLE=y, makedumpfile fails like the following: # makedumpfile -d31 /proc/vmcore dumpfile The kernel version is not supported. The created dumpfile may be incomplete. _exclude_free_page: Can't get next online node. makedumpfile Failed. # The cause is the lack of the symbol "phys_base" in a vmcoreinfo data. If the symbol "phys_base" does not exist, makedumpfile considers an x86_64 kernel as non relocatable. As the result, makedumpfile misunderstands the physical address where the kernel is loaded, and it cannot translate a kernel virtual address to physical address correctly. To fix this problem, this patch adds the symbol "phys_base" to a vmcoreinfo data. Signed-off-by: NKen'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@kernel.org> Acked-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-