- 17 12月, 2008 1 次提交
-
-
由 Mike Travis 提交于
Impact: cleanup, change parameter passing * Change genapic interfaces to accept cpumask_t pointers where possible. * Modify external callers to use cpumask_t pointers in function calls. * Create new send_IPI_mask_allbutself which is the same as the send_IPI_mask functions but removes smp_processor_id() from list. This removes another common need for a temporary cpumask_t variable. * Functions that used a temp cpumask_t variable for: cpumask_t allbutme = cpu_online_map; cpu_clear(smp_processor_id(), allbutme); if (!cpus_empty(allbutme)) ... become: if (!cpus_equal(cpu_online_map, cpumask_of_cpu(cpu))) ... * Other minor code optimizations (like using cpus_clear instead of CPU_MASK_NONE, etc.) Applies to linux-2.6.tip/master. Signed-off-by: NMike Travis <travis@sgi.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NIngo Molnar <mingo@elte.hu>
-
- 09 12月, 2008 1 次提交
-
-
由 Ingo Molnar 提交于
Impact: build fix Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 08 12月, 2008 3 次提交
-
-
由 Yinghai Lu 提交于
Impact: sanitize MSI irq number ordering from top-down to bottom-up Increase new MSI IRQs starting from nr_irqs_gsi (which is somewhere below 256), instead of decreasing from NR_IRQS. (The latter method can result in confusingly high IRQ numbers - if NR_CPUS is set to a high value and NR_IRQS scales up to a high value.) Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
Impact: cleanup Introduce NR_IRQS_LEGACY instead of hard coded number. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
Impact: new feature Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with NR_CPUS set to large values. The goal is to be able to scale up to much larger NR_IRQS value without impacting the (important) common case. To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of irq_desc pointers. When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc, this also makes the IRQ descriptors NUMA-local (to the site that calls request_irq()). This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now uses desc->chip_data for x86 to store irq_cfg. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 03 12月, 2008 1 次提交
-
-
由 Richard Kennedy 提交于
Remove 16 bytes of padding from struct amd_iommu on 64bit builds reducing its size to 120 bytes, allowing it to span one fewer cachelines. Signed-off-by: NRichard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>
-
- 01 12月, 2008 2 次提交
-
-
由 Mahesh Salgaonkar 提交于
Impact: do not expose a control that has no effect Fix to prevent sched_mc_power_saving from being exported through sysfs on single-socket systems. (Say multicore single socket (Laptop)) CPU core map of the boot cpu should be equal to possible number of cpus for single socket system. This fix has been developed at FOSS.in kernel workout. Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Christoph Hellwig 提交于
All architectures now use the generic compat_sys_ptrace, as should every new architecture that needs 32bit compat (if we'll ever get another). Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also kill a comment about __ARCH_SYS_PTRACE that was added after __ARCH_SYS_PTRACE was already gone. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 11月, 2008 1 次提交
-
-
由 Thomas Bogendoerfer 提交于
Devices like b44 ethernet can't dma from addresses above 1GB. The driver handles this cases by falling back to GFP_DMA allocation. But for detecting the problem it needs to get an indication from dma_mapping_error. The bug is triggered by using a VMSPLIT option of 2G/2G. Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 27 11月, 2008 1 次提交
-
-
由 Joerg Roedel 提交于
Impact: fix boot crash on AMD IOMMU if CONFIG_GART_IOMMU is off Currently these macros evaluate to a no-op except the kernel is compiled with GART or Calgary support. But we also need these macros when we have SWIOTLB, VT-d or AMD IOMMU in the kernel. Since we always compile at least with SWIOTLB we can define these macros always. This patch is also for stable backport for the same reason the SWIOTLB default selection patch is. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Cc: <stable@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 11月, 2008 4 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: cleanup This patch changes the name of the "return function tracer" into function-graph-tracer which is a more suitable name for a tracing which makes one able to retrieve the ordered call stack during the code flow. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Markus Metzger 提交于
Impact: restructure DS memory allocation to be done by the usage site of DS Require pre-allocated buffers in ds.h. Move the BTS buffer allocation for ptrace into ptrace.c. The pointer to the allocated buffer is stored in the traced task's task_struct together with the handle returned by ds_request_bts(). Removes memory accounting code. Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Markus Metzger 提交于
Impact: generalize the DS code to shared buffers Change the in-kernel ds.h interface to identify the tracer via a handle returned on ds_request_~(). Tracers used to be identified via their task_struct. The changes are required to allow DS to be shared between different tasks, which is needed for perfmon2 and for ftrace. For ptrace, the handle is stored in the traced task's task_struct. This should probably go into a (arch-specific) ptrace context some time. Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Markus Metzger 提交于
Impact: cleanup Replace a macro with a static inline function. Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 11月, 2008 1 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: use deeper function tracing depth safely Some tests showed that function return tracing needed a more deeper depth of function calls. But it could be unsafe to store these return addresses to the stack. So these arrays will now be allocated dynamically into task_struct of current only when the tracer is activated. Typical scheme when tracer is activated: - allocate a return stack for each task in global list. - fork: allocate the return stack for the newly created task - exit: free return stack of current - idle init: same as fork I chose a default depth of 50. I don't have overruns anymore. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 20 11月, 2008 1 次提交
-
-
由 Ulrich Drepper 提交于
Introduce a new accept4() system call. The addition of this system call matches analogous changes in 2.6.27 (dup3(), evenfd2(), signalfd4(), inotify_init1(), epoll_create1(), pipe2()) which added new system calls that differed from analogous traditional system calls in adding a flags argument that can be used to access additional functionality. The accept4() system call is exactly the same as accept(), except that it adds a flags bit-mask argument. Two flags are initially implemented. (Most of the new system calls in 2.6.27 also had both of these flags.) SOCK_CLOEXEC causes the close-on-exec (FD_CLOEXEC) flag to be enabled for the new file descriptor returned by accept4(). This is a useful security feature to avoid leaking information in a multithreaded program where one thread is doing an accept() at the same time as another thread is doing a fork() plus exec(). More details here: http://udrepper.livejournal.com/20407.html "Secure File Descriptor Handling", Ulrich Drepper). The other flag is SOCK_NONBLOCK, which causes the O_NONBLOCK flag to be enabled on the new open file description created by accept4(). (This flag is merely a convenience, saving the use of additional calls fcntl(F_GETFL) and fcntl (F_SETFL) to achieve the same result. Here's a test program. Works on x86-32. Should work on x86-64, but I (mtk) don't have a system to hand to test with. It tests accept4() with each of the four possible combinations of SOCK_CLOEXEC and SOCK_NONBLOCK set/clear in 'flags', and verifies that the appropriate flags are set on the file descriptor/open file description returned by accept4(). I tested Ulrich's patch in this thread by applying against 2.6.28-rc2, and it passes according to my test program. /* test_accept4.c Copyright (C) 2008, Linux Foundation, written by Michael Kerrisk <mtk.manpages@gmail.com> Licensed under the GNU GPLv2 or later. */ #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/socket.h> #include <netinet/in.h> #include <stdlib.h> #include <fcntl.h> #include <stdio.h> #include <string.h> #define PORT_NUM 33333 #define die(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0) /**********************************************************************/ /* The following is what we need until glibc gets a wrapper for accept4() */ /* Flags for socket(), socketpair(), accept4() */ #ifndef SOCK_CLOEXEC #define SOCK_CLOEXEC O_CLOEXEC #endif #ifndef SOCK_NONBLOCK #define SOCK_NONBLOCK O_NONBLOCK #endif #ifdef __x86_64__ #define SYS_accept4 288 #elif __i386__ #define USE_SOCKETCALL 1 #define SYS_ACCEPT4 18 #else #error "Sorry -- don't know the syscall # on this architecture" #endif static int accept4(int fd, struct sockaddr *sockaddr, socklen_t *addrlen, int flags) { printf("Calling accept4(): flags = %x", flags); if (flags != 0) { printf(" ("); if (flags & SOCK_CLOEXEC) printf("SOCK_CLOEXEC"); if ((flags & SOCK_CLOEXEC) && (flags & SOCK_NONBLOCK)) printf(" "); if (flags & SOCK_NONBLOCK) printf("SOCK_NONBLOCK"); printf(")"); } printf("\n"); #if USE_SOCKETCALL long args[6]; args[0] = fd; args[1] = (long) sockaddr; args[2] = (long) addrlen; args[3] = flags; return syscall(SYS_socketcall, SYS_ACCEPT4, args); #else return syscall(SYS_accept4, fd, sockaddr, addrlen, flags); #endif } /**********************************************************************/ static int do_test(int lfd, struct sockaddr_in *conn_addr, int closeonexec_flag, int nonblock_flag) { int connfd, acceptfd; int fdf, flf, fdf_pass, flf_pass; struct sockaddr_in claddr; socklen_t addrlen; printf("=======================================\n"); connfd = socket(AF_INET, SOCK_STREAM, 0); if (connfd == -1) die("socket"); if (connect(connfd, (struct sockaddr *) conn_addr, sizeof(struct sockaddr_in)) == -1) die("connect"); addrlen = sizeof(struct sockaddr_in); acceptfd = accept4(lfd, (struct sockaddr *) &claddr, &addrlen, closeonexec_flag | nonblock_flag); if (acceptfd == -1) { perror("accept4()"); close(connfd); return 0; } fdf = fcntl(acceptfd, F_GETFD); if (fdf == -1) die("fcntl:F_GETFD"); fdf_pass = ((fdf & FD_CLOEXEC) != 0) == ((closeonexec_flag & SOCK_CLOEXEC) != 0); printf("Close-on-exec flag is %sset (%s); ", (fdf & FD_CLOEXEC) ? "" : "not ", fdf_pass ? "OK" : "failed"); flf = fcntl(acceptfd, F_GETFL); if (flf == -1) die("fcntl:F_GETFD"); flf_pass = ((flf & O_NONBLOCK) != 0) == ((nonblock_flag & SOCK_NONBLOCK) !=0); printf("nonblock flag is %sset (%s)\n", (flf & O_NONBLOCK) ? "" : "not ", flf_pass ? "OK" : "failed"); close(acceptfd); close(connfd); printf("Test result: %s\n", (fdf_pass && flf_pass) ? "PASS" : "FAIL"); return fdf_pass && flf_pass; } static int create_listening_socket(int port_num) { struct sockaddr_in svaddr; int lfd; int optval; memset(&svaddr, 0, sizeof(struct sockaddr_in)); svaddr.sin_family = AF_INET; svaddr.sin_addr.s_addr = htonl(INADDR_ANY); svaddr.sin_port = htons(port_num); lfd = socket(AF_INET, SOCK_STREAM, 0); if (lfd == -1) die("socket"); optval = 1; if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval)) == -1) die("setsockopt"); if (bind(lfd, (struct sockaddr *) &svaddr, sizeof(struct sockaddr_in)) == -1) die("bind"); if (listen(lfd, 5) == -1) die("listen"); return lfd; } int main(int argc, char *argv[]) { struct sockaddr_in conn_addr; int lfd; int port_num; int passed; passed = 1; port_num = (argc > 1) ? atoi(argv[1]) : PORT_NUM; memset(&conn_addr, 0, sizeof(struct sockaddr_in)); conn_addr.sin_family = AF_INET; conn_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); conn_addr.sin_port = htons(port_num); lfd = create_listening_socket(port_num); if (!do_test(lfd, &conn_addr, 0, 0)) passed = 0; if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, 0)) passed = 0; if (!do_test(lfd, &conn_addr, 0, SOCK_NONBLOCK)) passed = 0; if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, SOCK_NONBLOCK)) passed = 0; close(lfd); exit(passed ? EXIT_SUCCESS : EXIT_FAILURE); } [mtk.manpages@gmail.com: rewrote changelog, updated test program] Signed-off-by: NUlrich Drepper <drepper@redhat.com> Tested-by: NMichael Kerrisk <mtk.manpages@gmail.com> Acked-by: NMichael Kerrisk <mtk.manpages@gmail.com> Cc: <linux-api@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 11月, 2008 2 次提交
-
-
由 Hiroshi Shimamoto 提交于
__copy_from_user() will return invalid value 16 when it fails to access user space and the size is 10. Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
Impact: clean up We can autodetect those system that need cluster apic, and update genapic accordingly. We can also remove wakeup.h for e7000, because it's default one is now the same as overall default mach_wakecpu.h Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 11月, 2008 3 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: help to find the better depth of trace We decided to arbitrary define the depth of function return trace as "20". Perhaps this is not enough. To help finding an optimal depth, we measure now the overrun: the number of functions that have been missed for the current thread. By default this is not displayed, we have to do set a particular flag on the return tracer: echo overrun > /debug/tracing/trace_options And the overrun will be printed on the right. As the trace shows below, the current 20 depth is not enough. update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838) update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838) do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838) tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838) tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838) vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274) vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274) set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274) con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274) release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274) release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274) con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274) con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274) n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274) smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274) irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274) smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274) ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274) ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274) ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274) hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274) Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
Impact: fix wakeup_secondary_cpu with hotplug We can not put that into x86_quirks, because that is __initdata. So try to move that to genapic, and add update_genapic in x86_quirks. later we even could use that stub to: 1. autodetect CONFIG_ES7000_CLUSTERED_APIC 2. more correct inquire_remote_apic with apic_verbosity setting. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
Impact: fix secondary-CPU wakeup/init path with numaq and es7000 While looking at wakeup_secondary_cpu for WAKE_SECONDARY_VIA_NMI: |#ifdef WAKE_SECONDARY_VIA_NMI |/* | * Poke the other CPU in the eye via NMI to wake it up. Remember that the normal | * INIT, INIT, STARTUP sequence will reset the chip hard for us, and this | * won't ... remember to clear down the APIC, etc later. | */ |static int __devinit |wakeup_secondary_cpu(int logical_apicid, unsigned long start_eip) |{ | unsigned long send_status, accept_status = 0; | int maxlvt; |... | if (APIC_INTEGRATED(apic_version[phys_apicid])) { | maxlvt = lapic_get_maxlvt(); I noticed that there is no warning about undefined phys_apicid... because WAKE_SECONDARY_VIA_NMI and WAKE_SECONDARY_VIA_INIT can not be defined at the same time. So NUMAQ is using wrong wakeup_secondary_cpu. WAKE_SECONDARY_VIA_NMI, WAKE_SECONDARY_VIA_INIT and WAKE_SECONDARY_VIA_MIP are variants of a weird and fragile preprocessor-driven "HAL" mechanisms to specify the kind of secondary-CPU wakeup strategy a given x86 kernel will use. The vast majority of systems want to use INIT for secondary wakeup - NUMAQ uses an NMI, (old-style-) ES7000 uses 'MIP' (a firmware driven in-memory flag to let secondaries continue). So convert these mechanisms to x86_quirks and add a ->wakeup_secondary_cpu() method to specify the rare exception to the sane default. Extend genapic accordingly as well, for 32-bit. While looking further, I noticed that functions in wakecup.h for numaq and es7000 are different to the default in mach_wakecpu.h - but smpboot.c will only use default mach_wakecpu.h with smphook.h. So we need to add mach_wakecpu.h for mach_generic, to properly support numaq and es7000, and vectorize the following SMP init methods: int trampoline_phys_low; int trampoline_phys_high; void (*wait_for_init_deassert)(atomic_t *deassert); void (*smp_callin_clear_local_apic)(void); void (*store_NMI_vector)(unsigned short *high, unsigned short *low); void (*restore_NMI_vector)(unsigned short *high, unsigned short *low); void (*inquire_remote_apic)(int apicid); Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 16 11月, 2008 2 次提交
-
-
由 Steven Rostedt 提交于
Impact: allow archs more flexibility on dynamic ftrace implementations Dynamic ftrace has largly been developed on x86. Since x86 does not have the same limitations as other architectures, the ftrace interaction between the generic code and the architecture specific code was not flexible enough to handle some of the issues that other architectures have. Most notably, module trampolines. Due to the limited branch distance that archs make in calling kernel core code from modules, the module load code must create a trampoline to jump to what will make the larger jump into core kernel code. The problem arises when this happens to a call to mcount. Ftrace checks all code before modifying it and makes sure the current code is what it expects. Right now, there is not enough information to handle modifying module trampolines. This patch changes the API between generic dynamic ftrace code and the arch dependent code. There is now two functions for modifying code: ftrace_make_nop(mod, rec, addr) - convert the code at rec->ip into a nop, where the original text is calling addr. (mod is the module struct if called by module init) ftrace_make_caller(rec, addr) - convert the code rec->ip that should be a nop into a caller to addr. The record "rec" now has a new field called "arch" where the architecture can add any special attributes to each call site record. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 David Woodhouse 提交于
This reverts commit e51af663, which was wrongly hoovered up and submitted about a month after a better fix had already been merged. The better fix is commit cbda1ba8 ("PCI/iommu: blacklist DMAR on Intel G31/G33 chipsets"), where we do this blacklisting based on the DMI identification for the offending motherboard, since sometimes this chipset (or at least a chipset with the same PCI ID) apparently _does_ actually have an IOMMU. Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 11月, 2008 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Impact: fix crash during hibernation on 32-bit NUMA The NUMA code on x86_32 creates special memory mapping that allows each node's pgdat to be located in this node's memory. For this purpose it allocates a memory area at the end of each node's memory and maps this area so that it is accessible with virtual addresses belonging to low memory. As a result, if there is high memory, these NUMA-allocated areas are physically located in high memory, although they are mapped to low memory addresses. Our hibernation code does not take that into account and for this reason hibernation fails on all x86_32 systems with CONFIG_NUMA=y and with high memory present. Fix this by adding a special mapping for the NUMA-allocated memory areas to the temporary page tables created during the last phase of resume. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 11月, 2008 2 次提交
-
-
由 Bjorn Helgaas 提交于
This removes the acpi_irq_balance_set() interface from the PCI interrupt link driver. x86 used acpi_irq_balance_set() to tell the PCI interrupt link driver to configure links to minimize IRQ sharing. But the link driver can easily figure out whether to turn on IRQ balancing based on the IRQ model (PIC/IOAPIC/etc), so we can get rid of that external interface. It's better for the driver to figure this out at init-time. If we set it externally via the x86 code, the interface reduces modularity, and we depend on the fact that acpi_process_madt() happens before we process the kernel command line. Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: NLen Brown <len.brown@intel.com>
-
由 H. Peter Anvin 提交于
Impact: Changes reboot behavior. If port CF9 seems to be safe to touch, attempt it before trying the keyboard controller. Port CF9 is not available on all chipsets (a significant but decreasing number of modern chipsets don't implement it), but port CF9 itself should in general be safe to poke (no ill effects if unimplemented) on any system which has PCI Configuration Method #1 or #2, as it falls inside the PCI configuration port range in both cases. No chipset without PCI is known to have port CF9, either, although an explicit "pci=bios" would mean we miss this and therefore don't use port CF9. An explicit "reboot=pci" can be used to force the use of port CF9. Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 11 11月, 2008 2 次提交
-
-
由 Ivan Vecera 提交于
Impact: really halt all CPUs on halt Function machine_halt (resp. native_machine_halt) is empty for x86 architectures. When command 'halt -f' is invoked, the message "System halted." is displayed but this is not really true because all CPUs are still running. There are also similar inconsistencies for other arches (some uses power-off for halt or forever-loop with IRQs enabled/disabled). IMO there should be used the same approach for all architectures OR what does the message "System halted" really mean? This patch fixes it for x86. Signed-off-by: NIvan Vecera <ivecera@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Impact: add infrastructure for function-return tracing Add low level support for ftrace return tracing. This plug-in stores return addresses on the thread_info structure of the current task. The index of the current return address is initialized when the task is the first one (init) and when a process forks (the child). It is not needed when a task does a sys_execve because after this syscall, it still needs to return on the kernel functions it called. Note that the code of return_to_handler has been suggested by Steven Rostedt as almost all of the ideas of improvements in this V3. For purpose of security, arch/x86/kernel/process_32.c is not traced because __switch_to() changes the current task during its execution. That could cause inconsistency in the stored return address of this function even if I didn't have any crash after testing with tracing on this function enabled. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 11月, 2008 1 次提交
-
-
由 Arjan van de Ven 提交于
a new file was accidentally added to include/asm-x86; move it to the new arch/x86/include/asm location Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
-
- 08 11月, 2008 1 次提交
-
-
由 Ingo Molnar 提交于
in scheduler-intense workloads native_read_tsc() overhead accounts for 20% of the system overhead: 659567 system_call 41222.9375 686796 schedule 435.7843 718382 __switch_to 665.1685 823875 switch_mm 4526.7857 1883122 native_read_tsc 55385.9412 9761990 total 2.8468 this is large part due to the rdtsc_barrier() that is done before and after reading the TSC. But sched_clock() is not a precise clock in the GTOD sense, using such barriers is completely pointless. So remove the barriers and only use them in vget_cycles(). This improves lat_ctx performance by about 5%. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 06 11月, 2008 3 次提交
-
-
由 Yinghai Lu 提交于
Impact: fix warning message when PARAVIRT is set in config Remove stale #ifdef components from our IRQ sizing logic. x86/Voyager is the only holdout. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yinghai Lu 提交于
Impact: make NR_IRQS big enough for system with lots of apic/pins If lots of IO_APIC's are there (or can be there), size the same way as 64-bit, depending on MAX_IO_APICS and NR_CPUS. This fixes the boot problem reported by Ben Hutchings on a 32-bit server with 5 IO-APICs and 240 IO-APIC pins. Signed-off-by: NYinghai <yinghai@kernel.org> Tested-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Impact: improve wakeup affinity on NUMA systems, tweak SMP systems Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain balancing defaults on NUMA and SMP systems. Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason why we would not want to have wakeup affinity across nodes as well. (we already do this in the standard NUMA template.) lat_ctx on a NUMA box is particularly happy about this change: before: | phoenix:~/l> ./lat_ctx -s 0 2 | "size=0k ovr=2.60 | 2 5.70 after: | phoenix:~/l> ./lat_ctx -s 0 2 | "size=0k ovr=2.65 | 2 2.07 a 2.75x speedup. pipe-test is similarly happy about it too: | phoenix:~/sched-tests> ./pipe-test | 18.26 usecs/loop. | 14.70 usecs/loop. | 14.38 usecs/loop. | 10.55 usecs/loop. # +WAKE_AFFINE on domain0+domain1 | 8.63 usecs/loop. | 8.59 usecs/loop. | 9.03 usecs/loop. | 8.94 usecs/loop. | 8.96 usecs/loop. | 8.63 usecs/loop. Also: - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings) - enable SD_WAKE_BALANCE on SMP domains Sysbench+postgresql improves all around the board, quite significantly: .28-rc3-11474e2c .28-rc3-11474e2c-tune ------------------------------------------------- 1: 571 688 +17.08% 2: 1236 1206 -2.55% 4: 2381 2642 +9.89% 8: 4958 5164 +3.99% 16: 9580 9574 -0.07% 32: 7128 8118 +12.20% 64: 7342 8266 +11.18% 128: 7342 8064 +8.95% 256: 7519 7884 +4.62% 512: 7350 7731 +4.93% ------------------------------------------------- SUM: 55412 59341 +6.62% So it's a win both for the runup portion, the peak area and the tail. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 03 11月, 2008 2 次提交
-
-
由 Steven Rostedt 提交于
Impact: build fix for non-ftrace architectures Not all archs implement ftrace, and therefore do not have an asm/ftrace.h. This patch corrects the problem. The ftrace_nmi_enter/exit now must be defined for all archs that implement dynamic ftrace. Currently, only x86 does. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 James Bottomley 提交于
Impact: fix x86/Voyager build Looks like this became static on the rest of x86. Fix it up by adding an external definition to mach-voyager/setup.c Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 31 10月, 2008 5 次提交
-
-
由 Venki Pallipadi 提交于
Impact: fix xsave slowdown regression Fix two features from conflicting in feature bits. Fixes this performance regression: Subject: cpu2000(both float and int) 13% regression with 2.6.28-rc1 http://lkml.org/lkml/2008/10/28/36Reported-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Bisected-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Steven Rostedt 提交于
Impact: cleanup This patch cleans up the NMI safe code for dynamic ftrace as suggested by Andrew Morton. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Keith Packard 提交于
Impact: introduce new APIs, separate kmap code from CONFIG_HIGHMEM This takes the code used for CONFIG_HIGHMEM memory mappings except that it's designed for dynamic IO resource mapping. These fixmaps are available even with CONFIG_HIGHMEM turned off. Signed-off-by: NKeith Packard <keithp@keithp.com> Signed-off-by: NEric Anholt <eric@anholt.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 James Bottomley 提交于
Impact: build fix on x86/Voyager Given commits like this: | Author: Suresh Siddha <suresh.b.siddha@intel.com> | Date: Tue Jul 29 10:29:19 2008 -0700 | | x86, xsave: enable xsave/xrstor on cpus with xsave support Which deliberately expose boot cpu dependence to pieces of the system, I think it's time to explicitly have a variable for it to prevent this continual misassumption that the boot CPU is zero. Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Steven Rostedt 提交于
Impact: fix crashes that can occur in NMI handlers, if their code is modified Modifying code is something that needs special care. On SMP boxes, if code that is being modified is also being executed on another CPU, that CPU will have undefined results. The dynamic ftrace uses kstop_machine to make the system act like a uniprocessor system. But this does not address NMIs, that can still run on other CPUs. One approach to handle this is to make all code that are used by NMIs not be traced. But NMIs can call notifiers that spread throughout the kernel and this will be very hard to maintain, and the chance of missing a function is very high. The approach that this patch takes is to have the NMIs modify the code if the modification is taking place. The way this works is that just writing to code executing on another CPU is not harmful if what is written is the same as what exists. Two buffers are used: an IP buffer and a "code" buffer. The steps that the patcher takes are: 1) Put in the instruction pointer into the IP buffer and the new code into the "code" buffer. 2) Set a flag that says we are modifying code 3) Wait for any running NMIs to finish. 4) Write the code 5) clear the flag. 6) Wait for any running NMIs to finish. If an NMI is executed, it will also write the pending code. Multiple writes are OK, because what is being written is the same. Then the patcher must wait for all running NMIs to finish before going to the next line that must be patched. This is basically the RCU approach to code modification. Thanks to Ingo Molnar for suggesting the idea, and to Arjan van de Ven for his guidence on what is safe and what is not. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-