1. 24 4月, 2009 1 次提交
  2. 23 4月, 2009 1 次提交
    • Y
      x86: check boundary in setup_node_bootmem() · 4c31e92b
      Yinghai Lu 提交于
      Commit dc098551 ("x86/uv: fix init of memory-less nodes") causes a
      two sockets system (where node-1 doesn't have RAM installed) to crash.
      
      That commit makes node_possible include cpu nodes that do not have memory.
      So check boundary in setup_node_bootmem().
      
      [ Impact: fix boot crash on RAM-less NUMA node system ]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <49EF89DF.9090404@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4c31e92b
  3. 22 4月, 2009 8 次提交
  4. 21 4月, 2009 5 次提交
    • R
      x86: avoid theoretical spurious NMI backtraces with CONFIG_CPUMASK_OFFSTACK=y · fcc5c4a2
      Rusty Russell 提交于
      In theory (though not shown in practice) alloc_cpumask_var() doesn't zero
      memory, so CPUs might print an "NMI backtrace for cpu %d" once on boot.
      
      (Bug introduced in fcef8576).
      
      [ Impact: avoid theoretical syslog noise in rare configs ]
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fcc5c4a2
    • R
      x86: fix boot crash in NMI watchdog with CONFIG_CPUMASK_OFFSTACK=y and flat APIC · 2f537a9f
      Rusty Russell 提交于
      fcef8576 converted backtrace_mask to a
      cpumask_var_t, and assumed check_nmi_watchdog was called before
      nmi_watchdog_tick was ever called.  Steven's oops shows I was wrong.
      
      This is something of a bandaid: I'm not sure we *should* be calling
      nmi_watchdog_tick before check_nmi_watchdog.  Note that gcc eliminates
      this test for the CONFIG_CPUMASK_OFFSTACK=n case.
      
      [ Impact: fix boot crash in rare configs ]
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2f537a9f
    • O
      Separate out common fstatat code into vfs_fstatat · 0112fc22
      Oleg Drokin 提交于
      This is a version incorporating Christoph's suggestion.
      
      Separate out common *fstatat functionality into a single function
      instead of duplicating it all over the code.
      Signed-off-by: NOleg Drokin <green@linuxhacker.ru>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      0112fc22
    • S
      x86-64: fix FPU corruption with signals and preemption · 06c38d5e
      Suresh Siddha 提交于
      In 64bit signal delivery path, clear_used_math() was happening before saving
      the current active FPU state on to the user stack for signal handling. Between
      clear_used_math() and the state store on to the user stack, potentially we
      can get a page fault for the user address and can block. Infact, while testing
      we were hitting the might_fault() in __clear_user() which can do a schedule().
      
      At a later point in time, we will schedule back into this process and
      resume the save state (using "xsave/fxsave" instruction) which can lead
      to DNA fault. And as used_math was cleared before, we will reinit the FP state
      in the DNA fault and continue. This reinit will result in loosing the
      FPU state of the process.
      
      Move clear_used_math() to a point after the FPU state has been stored
      onto the user stack.
      
      This issue is present from a long time (even before the xsave changes
      and the x86 merge). But it can easily be exposed in 2.6.28.x and 2.6.29.x
      series because of the __clear_user() in this path, which has an explicit
      __cond_resched() leading to a context switch with CONFIG_PREEMPT_VOLUNTARY.
      
      [ Impact: fix FPU state corruption ]
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: <stable@kernel.org>			[2.6.28.x, 2.6.29.x]
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      06c38d5e
    • J
      x86/uv: fix for no memory at paddr 0 · fc61e663
      Jack Steiner 提交于
      Fix endcase where the memory at physical address 0 does not really
      exist AND one of the sockets on blade 0 has no active cpus.
      
      The memory that _appears_ to be at physical address 0 is actually
      memory that located at a different address but has been remapped by
      the chipset so that it appears to be at physical address 0.
      
      When determining the UV pnode, the algorithm for determining the pnode
      incorrectly used the relocated physical address instead of the actual
      (global) address.
      
      [ Impact: boot failure on partitioned systems ]
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090420132530.GA23156@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fc61e663
  5. 20 4月, 2009 4 次提交
    • T
      acpi-cpufreq: Do not let get_measured perf depend on internal variable · d876dfbb
      Thomas Renninger 提交于
      Take already available policy->cpuinfo.max_freq and get rid of acpi-cpufreq
      specific max_freq variable.
      
      This implies that P0 is always the highest frequency which should always
      be true as ACPI spec says:
      As a result, the zeroth entry describes the highest performance state
      Signed-off-by: NThomas Renninger <trenn@suse.de>
      Acked-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      d876dfbb
    • T
      d91758f5
    • T
      acpi-cpufreq: Cleanup: Use printk_once · e0e8c4e5
      Thomas Renninger 提交于
      Signed-off-by: NThomas Renninger <trenn@suse.de>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      e0e8c4e5
    • P
      x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf · 093f13e2
      Pallipadi, Venkatesh 提交于
      Fix for a regression that was introduced by earlier commit
      18b2646f on Mon Apr 6 11:26:08 2009
      
      Regression resulted in the below error happened on systems with
      software coordination where per_cpu acpi data will not be initiated for
      secondary CPUs in a P-state domain.
      
      On Tue, 2009-04-14 at 23:01 -0700, Zhang, Yanmin wrote:
       My machine hanged with kernel 2.6.30-rc2 when script read
      > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
      >
      > opps happens in get_measured_perf:
      >
      >         cur.aperf.whole = readin.aperf.whole -
      >                                 per_cpu(drv_data, cpu)->saved_aperf;
      >
      > Because per_cpu(drv_data, cpu)=NULL.
      >
      > So function get_measured_perf should check if (per_cpu(drv_data,
      > cpu)==NULL)
      > and return 0 if it's NULL.
      
      --------------sys log------------------
      
      BUG: unable to handle kernel NULL pointer dereference at
      0000000000000020
      IP: [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
      PGD a7dd88067 PUD a7ccf5067 PMD 0
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
      CPU 0
      Modules linked in: video output
      Pid: 2091, comm: kondemand/0 Not tainted 2.6.30-rc2 #1 MP Server
      RIP: 0010:[<ffffffff8021af75>]  [<ffffffff8021af75>]
      get_measured_perf+0x4a/0xf9
      RSP: 0018:ffff880a7d56de20  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 00000046241a42b6 RCX: ffff88004d219000
      RDX: 000000000000b660 RSI: 0000000000000020 RDI: 0000000000000001
      RBP: ffff880a7f052000 R08: 00000046241a42b6 R09: ffffffff807639f0
      R10: 00000000ffffffea R11: ffffffff802207f4 R12: ffff880a7f052000
      R13: ffff88004d20e460 R14: 0000000000ddd5a6 R15: 0000000000000001
      FS:  0000000000000000(0000) GS:ffff88004d200000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000000020 CR3: 0000000a7f1bf000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process kondemand/0 (pid: 2091, threadinfo ffff880a7d56c000, task
      ffff880a7d4d18c0)
      Stack:
       ffff880a7f052078 ffffffff803efd54 00000046241a42b6 000000462ffa9e95
       0000000000000001 0000000000000001 00000000ffffffea ffffffff8064f41a
       0000000000000012 0000000000000012 ffff880a7f052000 ffffffff80650547
      Call Trace:
       [<ffffffff803efd54>] ? kobject_get+0x12/0x17
       [<ffffffff8064f41a>] ? __cpufreq_driver_getavg+0x42/0x57
       [<ffffffff80650547>] ? do_dbs_timer+0x147/0x272
       [<ffffffff80650400>] ? do_dbs_timer+0x0/0x272
       [<ffffffff802474ca>] ? worker_thread+0x15b/0x1f5
       [<ffffffff8024a02c>] ? autoremove_wake_function+0x0/0x2e
       [<ffffffff8024736f>] ? worker_thread+0x0/0x1f5
       [<ffffffff80249f0d>] ? kthread+0x54/0x83
       [<ffffffff8020c87a>] ? child_rip+0xa/0x20
       [<ffffffff80249eb9>] ? kthread+0x0/0x83
       [<ffffffff8020c870>] ? child_rip+0x0/0x20
      Code: 99 a6 03 00 31 c9 85 c0 0f 85 c3 00 00 00 89 df 4c 8b 44 24 10 48
      c7 c2 60 b6 00 00 48 8b 0c fd e0 30 a5 80 4c 89 c3 48 8b 04 0a <48> 2b
      58 20 48 8b 44 24 18 48 89 1c 24 48 8b 34 0a 48 2b 46 28
      RIP  [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
       RSP <ffff880a7d56de20>
      CR2: 0000000000000020
      ---[ end trace 2b8fac9a49e19ad4 ]---
      Tested-by: N"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NLen Brown <len.brown@intel.com>
      093f13e2
  6. 19 4月, 2009 1 次提交
  7. 18 4月, 2009 5 次提交
    • S
      lockdep, x86: account for irqs enabled in paranoid_exit · 0300e7f1
      Steven Rostedt 提交于
      I hit the check_flags error of lockdep:
      
       WARNING: at kernel/lockdep.c:2893 check_flags+0x1a7/0x1d0()
       [...]
       hardirqs last  enabled at (12567): [<ffffffff8026206a>] local_bh_enable+0xaa/0x110
       hardirqs last disabled at (12569): [<ffffffff80610c76>] int3+0x16/0x40
       softirqs last  enabled at (12566): [<ffffffff80514d2b>] lock_sock_nested+0xfb/0x110
       softirqs last disabled at (12568): [<ffffffff8058454e>] tcp_prequeue_process+0x2e/0xa0
      
      The check_flags warning of lockdep tells me that lockdep thought interrupts
      were disabled, but they were really enabled.
      
      The numbers in the above parenthesis show the order of events:
      
       12566: softirqs last enabled:  lock_sock_nested
       12567: hardirqs last enabled:  local_bh_enable
       12568: softirqs last disabled: tcp_prequeue_process
       12566: hardirqs last disabled: int3
      
      int3 is a breakpoint!
      
      Examining this further, I have CONFIG_NET_TCPPROBE enabled which adds
      break points into the kernel.
      
      The paranoid_exit of the return of int3 does not account for enabling
      interrupts on return to kernel. This code is a bit tricky since it
      is also used by the nmi handler (when lockdep is off), and we must be
      careful about the swapgs. We can not call kernel code after the swapgs
      has been performed.
      
      [ Impact: fix lockdep check_flags warning + self-turn-off ]
      Acked-by: NPeter Zijlsta <a.p.zijlstra@chello.nl>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0300e7f1
    • J
      x86: mm/numa_32.c calculate_numa_remap_pages should use __init · a81b6314
      Jaswinder Singh Rajput 提交于
      calculate_numa_remap_pages() is called only by __init initmem_init()
      further calculate_numa_remap_pages is calling:
      	__init find_e820_area() and __init reserve_early()
      
      So calculate_numa_remap_pages() should be __init calculate_numa_remap_pages().
      
       WARNING: arch/x86/built-in.o(.text+0x82ea3): Section mismatch in reference from the function calculate_numa_remap_pages() to the function .init.text:find_e820_area()
       The function calculate_numa_remap_pages() references
       the function __init find_e820_area().
       This is often because calculate_numa_remap_pages lacks a __init
       annotation or the annotation of find_e820_area is wrong.
      
       WARNING: arch/x86/built-in.o(.text+0x82f5f): Section mismatch in reference from the function calculate_numa_remap_pages() to the function .init.text:reserve_early()
       The function calculate_numa_remap_pages() references
       the function __init reserve_early().
       This is often because calculate_numa_remap_pages lacks a __init
       annotation or the annotation of reserve_early is wrong.
      
      [ Impact: save memory, address Section mismatch warning ]
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      LKML-Reference: <1239991281.3153.4.camel@ht.satnam>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a81b6314
    • H
      x86, kbuild: make "make install" not depend on vmlinux · 1648e4f8
      H. Peter Anvin 提交于
      It is common to use "make install" in restricted environments which
      differ from the one which was actually used to build the kernel.  In
      such environments it is highly undesirable to trigger a rebuild of any
      part of the system.  Worse, the rebuild may be spurious, triggered by
      differences in the environment.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      LKML-Reference: <20090415234642.GA28531@uranus.ravnborg.org>
      1648e4f8
    • J
      x86/uv: fix init of cpu-less nodes · 27229ca6
      Jack Steiner 提交于
      Fix an endcase in the UV initialization code for the "UV large system mode"
      of apicids.  If node zero contains no cpus, cpus on another node will be the
      boot cpu.  The percpu data that contains the extra apicid bits was not
      being initialized early enough.
      
      [ Impact: fix potential boot crash on cpu-less UV nodes ]
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090417142447.GA23759@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      27229ca6
    • J
      x86/uv: fix init of memory-less nodes · dc098551
      Jack Steiner 提交于
      Add support for nodes that have cpus but no memory.
      The current code was failing to add these nodes
      to the nodes_present_map.
      
      v2: Fixes case caught by David Rientjes - missed support
          for the x2apic SRAT table.
      
      [ Impact: fix potential boot crash on memory-less UV nodes. ]
      Reported-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20090417142242.GA23743@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dc098551
  8. 17 4月, 2009 4 次提交
    • Y
      x86/irq: mark NUMA_MIGRATE_IRQ_DESC broken · ca713c2a
      Yinghai Lu 提交于
      It causes crash on system with lots of cards with MSI-X
      when irq_balancer enabled...
      
      The patches fixing it were both complex and fragile, according
      to Eric they were also doing quite dangerous things to the
      hardware.
      
      Instead we now have patches that solve this problem via static
      NUMA node mappings - not dynamic allocation and balancing.
      
      The patches are much simpler than this method but are still too
      large outside of the merge window, so we mark the dynamic balancer
      as broken for now, and queue up the new approach for v2.6.31.
      
      [ Impact: deactivate broken kernel feature ]
      Reported-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <49E68C41.4020801@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ca713c2a
    • D
      x86: fix microcode driver newly spewing warnings · 0917798d
      Dmitry Adamushko 提交于
      Jeff Garzik reported this WARN_ON() noise:
      
      > Kernel: 2.6.30-rc1-00306-g8371f87c
      > Hardware: ICH10 x86-64
      >
      > This is a regression from 2.6.29.  Microcode spews the following WARNING
      > multiple times during boot:
      >
      > ------------[ cut here ]------------
      > WARNING: at fs/sysfs/group.c:138 sysfs_remove_group+0xeb/0xf0()
      > Hardware name:         sysfs group ffffffffa0209700 not found for
      >  kobject 'cpu0'
      
      Keep sysfs files around for cpus even when we failed to locate
      microcode for them at the moment of module loading. The appropriate
      microcode firmware can become available later on.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0917798d
    • P
      x86, PAT: Remove page granularity tracking for vm_insert_pfn maps · 4b065046
      Pallipadi, Venkatesh 提交于
      This change resolves the problem of too many single page entries
      in pat_memtype_list and "freeing invalid memtype" errors with i915,
      reported here:
      
        http://marc.info/?l=linux-kernel&m=123845244713183&w=2
      
      Remove page level granularity track and untrack of vm_insert_pfn.
      memtype tracking at page granularity does not scale and cleaner
      approach would be for the driver to request a type for a bigger
      IO address range or PCI io memory range for that device, either at
      mmap time or driver init time and just use that type during
      vm_insert_pfn.
      
      This patch just removes the track/untrack of vm_insert_pfn. That
      means we will be in same state as 2.6.28, with respect to these APIs.
      
      Newer APIs for the drivers to request a memtype for a bigger region
      is coming soon.
      
      [ Impact: fix Xorg startup warnings and hangs ]
      Reported-by: NArkadiusz Miskiewicz <a.miskiewicz@gmail.com>
      Tested-by: NArkadiusz Miskiewicz <a.miskiewicz@gmail.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      LKML-Reference: <20090408223716.GC3493@linux-os.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4b065046
    • C
      x86: UV BAU distribution and payload MMRs · 4ea3c51d
      Cliff Wickman 提交于
      This patch correctly sets BAU memory mapped registers to point
      to the sending activation descriptor table and target payload table.
      
      The "Broadcast Assist Unit" is used for TLB shootdown in UV.
      
      The memory mapped registers that point to sending and receiving
      memory structures contain node numbers.
      
      In one case the __pa() function did not provide the node id of
      memory on blade zero in configurations where that id is nonzero.
      In another case, it was assumed that memory was allocated on
      the local node.  That assumption is not true in a configuration
      in which the node has no memory.
      
      Tested on the UV hardware simulator.
      
      [ Impact: fix possible runtime crash due to incorrect TLB logic ]
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      LKML-Reference: <E1LuR5Z-0007An-B8@eag09.americas.sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4ea3c51d
  9. 16 4月, 2009 1 次提交
    • I
      x86: disable X86_PTRACE_BTS for now · d45b41ae
      Ingo Molnar 提交于
      Oleg Nesterov found a couple of races in the ptrace-bts code
      and fixes are queued up for it but they did not get ready in time
      for the merge window. We'll merge them in v2.6.31 - until then
      mark the feature as CONFIG_BROKEN. There's no user-space yet
      making use of this so it's not a big issue.
      
      Cc: <stable@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d45b41ae
  10. 15 4月, 2009 3 次提交
    • L
      acpi-cpufreq: fix 'smp_call_function_many()' confusion · ea34f43a
      Linus Torvalds 提交于
      It turns out that 'smp_call_function_many()' doesn't work at all like
      'smp_call_function_single()', and my change to Andrew's patch to use it
      rather than a loop over all CPU's acpi-cpufreq doesn't work.
      
      My bad.
      
      'smp_call_function_many()' has two "features" (aka "documented bugs"):
      
       (a) it needs to be called with preemption disabled, because it uses
           smp_processor_id() without guarding the CPU lookup with 'get_cpu()'
           and 'put_cpu()' like the 'single' variant does.
      
       (b) even if the current CPU is part of the CPU mask, it won't do the
           call on that CPU.
      
      Still, we're better off trying to use 'smp_call_function_many()' than
      looping over CPU's, since it at least in theory allows us to use a
      broadcast IPI and do it all in parallel.  So let's just work around the
      silly semantic bugs in that function.
      Reported-and-tested-by: NAli Gholami Rudi <ali@rudi.ir>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andrew Morton <akpm@linux-foundation.org>,
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ea34f43a
    • H
      x86 microcode: revert some work_on_cpu · 6f66cbc6
      Hugh Dickins 提交于
      Revert part of af5c820a ("x86: cpumask:
      use work_on_cpu in arch/x86/kernel/microcode_core.c")
      
      That change is causing only one Intel CPU's microcode to be updated e.g.
      microcode: CPU3 updated from revision 0x9 to 0x17, date = 2005-04-22
      where before it announced that also for CPU0 and CPU1 and CPU2.
      
      We cannot use work_on_cpu() in the CONFIG_MICROCODE_OLD_INTERFACE code,
      because Intel's request_microcode_user() involves a copy_from_user() from
      /sbin/microcode_ctl, which therefore needs to be on that CPU at the time.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6f66cbc6
    • C
      x86: UV: BAU partition-relative distribution map · 94ca8e48
      Cliff Wickman 提交于
      This patch enables each partition's BAU distribution bit map
      to be partition-relative.
      
      The distribution bitmap had been constructed assuming 0 as the base
      node number.  That construct would not have allowed a total system of
      greater than 256 nodes.
      It also corrects an error that occurred when the first blade's nasid
      was not zero.  That nasid was stored as the base node.
      The base node number gets added by hardware to the node numbers implied
      in the distribution bitmap, resulting in invalid target nasids.
      
      Tested on the UV hardware simulator.
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      LKML-Reference: <E1Ltl0C-0004Ob-37@eag09.americas.sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      94ca8e48
  11. 14 4月, 2009 4 次提交
    • P
      x86, irq: Remove IRQ_DISABLED check in process context IRQ move · 6ec3cfec
      Pallipadi, Venkatesh 提交于
      As discussed in the thread here:
      
        http://marc.info/?l=linux-kernel&m=123964468521142&w=2
      
      Eric W. Biederman observed:
      
      > It looks like some additional bugs have slipped in since last I looked.
      >
      > set_irq_affinity does this:
      > ifdef CONFIG_GENERIC_PENDING_IRQ
      >        if (desc->status & IRQ_MOVE_PCNTXT || desc->status & IRQ_DISABLED) {
      >                cpumask_copy(desc->affinity, cpumask);
      >                desc->chip->set_affinity(irq, cpumask);
      >        } else {
      >                desc->status |= IRQ_MOVE_PENDING;
      >                cpumask_copy(desc->pending_mask, cpumask);
      >        }
      > #else
      >
      > That IRQ_DISABLED case is a software state and as such it has nothing to
      > do with how safe it is to move an irq in process context.
      
      [...]
      
      >
      > The only reason we migrate MSIs in interrupt context today is that there
      > wasn't infrastructure for support migration both in interrupt context
      > and outside of it.
      
      Yes. The idea here was to force the MSI migration to happen in process
      context. One of the patches in the series did
      
              disable_irq(dev->irq);
              irq_set_affinity(dev->irq, cpumask_of(dev->cpu));
              enable_irq(dev->irq);
      
      with the above patch adding irq/manage code check for interrupt disabled
      and moving the interrupt in process context.
      
      IIRC, there was no IRQ_MOVE_PCNTXT when we were developing this HPET
      code and we ended up having this ugly hack. IRQ_MOVE_PCNTXT was there
      when we eventually submitted the patch upstream. But, looks like I did a
      blind rebasing instead of using IRQ_MOVE_PCNTXT in hpet MSI code.
      
      Below patch fixes this. i.e., revert commit 932775a4
      and add PCNTXT to HPET MSI setup. Also removes copying of desc->affinity
      in generic code as set_affinity routines are doing it internally.
      Reported-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Li Shaohua" <shaohua.li@intel.com>
      Cc: Gary Hade <garyhade@us.ibm.com>
      Cc: "lcm@us.ibm.com" <lcm@us.ibm.com>
      Cc: suresh.b.siddha@intel.com
      LKML-Reference: <20090413222058.GB8211@linux-os.sc.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6ec3cfec
    • L
      Fix quilt merge error in acpi-cpufreq.c · 1c98aa74
      Linus Torvalds 提交于
      We ended up incorrectly using '&cur' instead of '&readin' in the
      work_on_cpu() -> smp_call_function_single() transformation in commit
      01599fca ("cpufreq: use
      smp_call_function_[single|many]() in acpi-cpufreq.c").
      
      Andrew explains:
       "OK, the acpi tree went and had conflicting changes merged into it after
        I'd written the patch and it appears that I incorrectly reverted part
        of 18b2646f while fixing the resulting
        rejects.
      
        Switching it to `readin' looks correct."
      Acked-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1c98aa74
    • J
      x86: pci-swiotlb.c swiotlb_dma_ops should be static · ff6c6fed
      Jaswinder Singh Rajput 提交于
      Impact: reduce kernel size a bit, address sparse warning
      
      Addresses the problem pointed out by this sparse warning:
      
        arch/x86/kernel/pci-swiotlb.c:53:20: warning: symbol 'swiotlb_dma_ops' was not declared. Should it be static?
      
      For x86: swiotlb_dma_ops can be static, because it's not used outside
      of pci-swiotlb.c
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Acked-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      LKML-Reference: <1239558861.3938.2.camel@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ff6c6fed
    • A
      cpufreq: use smp_call_function_[single|many]() in acpi-cpufreq.c · 01599fca
      Andrew Morton 提交于
      Atttempting to rid us of the problematic work_on_cpu().  Just use
      smp_call_fuction_single() here.
      
      This repairs a 10% sysbench(oltp)+mysql regression which Mike reported,
      due to
      
        commit 6b44003e
        Author: Andrew Morton <akpm@linux-foundation.org>
        Date:   Thu Apr 9 09:50:37 2009 -0600
      
            work_on_cpu(): rewrite it to create a kernel thread on demand
      
      It seems that the kernel calls these acpi-cpufreq functions at a quite
      high frequency.
      
      Valdis Kletnieks also reports that this causes 70-90 forks per second on
      his hardware.
      
      Cc: Valdis.Kletnieks@vt.edu
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Zhao Yakui <yakui.zhao@intel.com>
      Acked-by: NDave Jones <davej@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: NMike Galbraith <efault@gmx.de>
      Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Acked-by: NIngo Molnar <mingo@elte.hu>
      [ Made it use smp_call_function_many() instead of looping over cpu's
        with smp_call_function_single()    - Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      01599fca
  12. 12 4月, 2009 3 次提交
    • S
      x86: add linux kernel support for YMM state · a30469e7
      Suresh Siddha 提交于
      Impact: save/restore Intel-AVX state properly between tasks
      
      Intel Advanced Vector Extensions (AVX) introduce 256-bit vector processing
      capability. More about AVX at http://software.intel.com/sites/avx
      
      Add OS support for YMM state management using xsave/xrstor infrastructure
      to support AVX.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1239402084.27006.8057.camel@localhost.localdomain>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a30469e7
    • M
      x86: fix wrong section of pat_disable & make it static · 1ee4bd92
      Marcin Slusarz 提交于
      pat_disable cannot be __cpuinit anymore because it's called from pat_init
      and the callchain looks like this:
      pat_disable [cpuinit] <- pat_init <- generic_set_all <-
       ipi_handler <- set_mtrr <- (other non init/cpuinit functions)
      
      WARNING: arch/x86/mm/built-in.o(.text+0x449e): Section mismatch in reference
      from the function pat_init() to the function .cpuinit.text:pat_disable()
      The function pat_init() references
      the function __cpuinit pat_disable().
      This is often because pat_init lacks a __cpuinit
      annotation or the annotation of pat_disable is wrong.
      
      Non CONFIG_X86_PAT version of pat_disable is static inline, so this version
      can be static too (and there are no callers outside of this file).
      Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      LKML-Reference: <49DFB055.6070405@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1ee4bd92
    • R
      x86: Fix section mismatches in mpparse · 57592224
      Rakib Mullick 提交于
      Impact: fix section mismatch
      
      In arch/x86/kernel/mpparse.c, smp_reserve_bootmem() has been called
      and also refers to a function which is in .init section. Thus causes
      the first warning. And check_irq_src() also requires an __init,
      because it refers to an .init section.
      Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <b9df5fa10904102004g51265d9axc8d07278bfdb6ba0@mail.gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      57592224